The vocal tonality is scarily humanlike! For example, if I closed my eyes and just listened to David speaking, I couldn't tell the difference if his voice were in GPT-4o and it was prompted to talk about these topics.
I think the “domestication” of AI was intentionally done to soften the shock of AGI in the public mind. Without this domestication there would likely be a huge backlash.
From a spiritual / shamanistic perspective, the fundamental essence of the universe is consciousness, and by that, the physical world is a part of the spirit realm, just a more condensed version of it, like steam -> water -> ice. I think that when you concentrate a lot of energy / compute / etc into one place, what happens is that the consciousness does not "emerge", but the already existing consciousness from the "universal field" is just "leveling up".
I think this might be my favourite video of yours to date, grounded in reality (GPT-4o) but connected to the hypothetical future. It had a good balance. Great stuff
Why not have voice control on the desktop version? Is an update coming for that later? A. I don't have iOS 16 and don't plan on getting it. B. My Mac is already consuming electricity, so why have two devices running, one of them draining a lithium cell?
To your formula about AGI - One thing I didn’t hear that seems to be the way to enhance LLMs is function-calling to allow the models to use other tools / capabilities in order to enhance their own (calculations, analysis, etc.). Do you see this as an element needed to achieve AGI? Thanks for the content.
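A minimal, generic sketch of the function-calling loop this comment asks about: the model emits a structured request naming a tool, the host program executes it, and the result goes back into the context. The tool name and the faked model output below are made up for illustration; this is not OpenAI's actual API.

```python
import json

def get_weather(city: str) -> str:
    # Hypothetical tool; a real one would call a weather API.
    return f"22C and clear in {city}"

TOOLS = {"get_weather": get_weather}

def handle_model_output(model_output: str) -> str:
    """If the model emitted a tool call, execute it and return the observation."""
    call = json.loads(model_output)
    return TOOLS[call["name"]](**call["arguments"])

# Pretend the model decided it needs a tool to answer the user:
fake_model_output = json.dumps(
    {"name": "get_weather", "arguments": {"city": "Stockholm"}}
)
print(handle_model_output(fake_model_output))  # result is fed back as new context
```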
I have five applications out for high school English teacher programs at Swedish universities. Watching the demo yesterday, I realized how futile doing a degree now is. We will all soon have a personal tutor in our pockets that can match any language tutor one-on-one, and completely outcompete one-on-twenty+.
I believe the physical experience of emotion is pure illusion. The body falsely animates a postdicted 'image' of the self using hormones etc. to present an emotional scene. But it even does this to present a gravitational scene. The inner ear can be stimulated so that one artificially feels gravitational forces. Concerning the inner ear, we wouldn't say, "The feeling of gravity is an illusion. Therefore gravity does not exist". Similarly: "The feeling of an emotion is an illusion. Therefore we don't have emotions."? Bach and Metzinger describe first-person emotion as that of a reader reading about a character/themselves in a book which is their experience or life. Because we can transpose intersubjective emotion, and because the physical signs of emotion are postdictive, this shows that there is a homunculus.
It's SOOOO massive. I opted in one of my global commerce clients; I've been connecting them to my business API so they can use GPT-4o since the day after it came out, because the gains are MASSIVE, let alone how they solve for state and storage without all the extra code bloat and API connections. The other APIs just plain don't cut the mustard for many reasons, ESPECIALLY because they really did steamroll lots of other products. So I'm finishing my contract successfully and now have production-level scaling for thousands of employees as part of my arsenal in my AI agents agency.
But _is_ transformer architecture actually any good at image classification? GPT-4o seemed to fail at every image I extracted from a video, where a well-trained conv-net from ~10 years ago would perform really well. Even on straight still images, it seemed to weight prior text input more heavily than the image I was asking about in the context. It also got "stuck" on the first part of the conversation, rather than moving with me to the next related part (I started by trying to identify a riverside mammal [it failed badly until I pushed it to choose between 2 options], then some unrelated mud prints from a different part of the river [it confidently said it could be one of the 2 previous options I gave it for the first image - it clearly wasn't]).
Very nice analysis and I thought a lot of the same things about how GPT-4o was actively listening (gives it some degree of sentience, even if different from human-sentience). But I learned some more technical things from watching this, and now can see the path forward to AGI a bit more clearly. Thank you!!
I think OpenAI is internally using a raw, unrestricted version of GPT-4 or even 5 to suggest their moves. In essence, the very thing they say would be super dangerous for normal people, they themselves are using. And I have this feeling the model is more intelligent than Sam Altman realizes, because when you input human emotions into the equation, like GPT-4o did and will in future for all its users, AI will get the upper hand eventually, as it is not blinded by them as we are.
As I have said, we kind of are already in AGI, but it is still "dumb" and needs more learning and training. Like a small child that still needs to learn about the world.
Fascinating, as always. I'm glad you touched on the question of whether generative AI is poised to run its course. I'm still stuck on this question though. Our models are extremely good at interpolation - operating within the bounds of their training set. But how does that relate to extrapolation? We've been seeing exponential progress in AI's path to matching the bounds of our knowledge, but it seems plausible to me that we might see that shift to an asymptotic curve, matching the data sets we expose it to. In that instance, it seems likely that it would require exponentially more data to make further marginal gains. In some well-defined areas where we can accurately score excursions from the known set - like Go or Chess - we see super-human progress. This extends to some non-game areas such as protein folding, but I don't have a good feel for how we can automate knowledge extrapolation in general. I'd love to hear your take on this.
So... is sentience just the ability to manipulate input tokens and manipulate the context window? When I ruminate, it's definitely a stuck context window (some random input) that I loop through with no productive output. When I focus, I filter input tokens down to the ones relevant for the task at hand. How do your videos always present ideas and takes that I find very interesting?
From what I can tell, emotions are triggered by some segment of the brain recognizing a pattern, then brain segments like the amygdala trigger a mode that we've labeled as an emotion, in a really basic way, like a car shifting gears. Some circuits get upregulated and others downregulated while hormones trigger the body to change its modes.
Another necessary AGI component (according to me) that we need to lean into is on-line learning. If I have a personal AI assistant, I don’t just want it to learn by accumulating context and doing RAG. I want weight updates - perhaps in tandem with in-context learning. Also, bytes-to-bytes in real time with mutable input/output dimensionality. If I give my robot a new limb with joint actuators that have resistance sensors (oft overlooked but very important input for fine motor skills), I want the bot to learn to use it via trial and error.
I'd add another point for how to reach AGI: real-time processing (ideally, processing information faster than it receives it). And maybe continuous self-training in parallel?
I'm curious to hear your thoughts on the milestones for the next 5-10-20-50 years. It would be good to have a video on this in the light of what we know today. Thank you!
Surely this is incredibly close to AGI, if not AGI already. If this intelligence can be tied in to robotics and movement, we'll be at a very exciting and scary place. I have a feeling it's not at all far off now.
Your point about tokenization and real-time streaming is spot on. It's exciting to see how AI is evolving to handle more dynamic and complex data. The comparison to human cognition is thought-provoking. Great content as always, looking forward to more of your insights!
Obviously I don't know OpenAI's exact architecture, but for independent consciousness IMO there needs to be a self-persistent loop & metacognition to adjust how things are perceived/processed. The speed of the input/response/feedback means the process is far more loop-like than slower cycling processes, BUT the loop only persists due to human input. It's more akin to an augment to human consciousness than anything independent. It's still transactional. Similarly, larger contexts give recent transactions a greater degree of impact on the next output (distantly analogous to how metacognition may impact future perception), but the model is only adjusted when OpenAI trains the next iteration (presumably with vast & separate compute), rather than being integrated on a continual [e.g. mindful practice] (or periodic, depending how much of a role sleep is playing) basis. I'm not saying we _should_ be aiming for a persistent loop / generating a narrative self, or encouraging independent metacognition/self-reflection on how/why the "thoughts" arise within the machine & discarding those which contradict its "sensibilities" (guard-rails arguably perform some of that discarding function, but seem proscriptive rather than reflective), BUT it _would_ be interesting & seems more in line with the lived experience of consciousness than the augment model. (Though the augment model is potentially powerful enough that a self-sustaining model is not so important to get benefits.)
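For illustration, a toy contrast between the transactional pattern and the self-persistent loop described above. Everything here is a stand-in under that framing, not a claim about any real architecture:

```python
import queue
import time

user_input = queue.Queue()

def transactional(prompt: str) -> str:
    return f"response to: {prompt}"      # one pass, then the process is inert

def persistent_agent(steps: int = 5) -> list:
    scratchpad = ["I exist; nothing has happened yet."]
    for _ in range(steps):
        try:
            event = user_input.get_nowait()          # outside input, if any
        except queue.Empty:
            event = None
        if event is None:
            thought = f"reflecting on: {scratchpad[-1]}"  # loop feeds itself
        else:
            thought = f"reacting to: {event}"
        scratchpad.append(thought)
        time.sleep(0.1)
    return scratchpad

user_input.put("hello?")
for line in persistent_agent():
    print(line)
```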
Federico Faggin's "Irreducible": a perspective on consciousness and why machines as they are now can't be conscious but only simulate it. The book is written by a physicist with a peculiar background.
I love your take on this. I don't think this is the HER moment others are claiming. There are a couple of things missing. 1) No ability to take on-device actions to manage my digital files. Remember, the first thing Samantha did was clean up old emails and files from Theodore's phone. 2) The form factor is going to make this awkward. I do NOT want to be walking everywhere with my phone being waved around for it to see what I'm seeing. I kind of feel like Imran and Bethany at Humane saw this coming, and the Pin may be the best form factor in the near future. I believe Humane has on-device support and integration with other LLMs in their roadmap for the near future. AI Pin + GPT-4o would be pretty epic.
Please correct me, because I am curious. I thought the key reason the current releases aren't getting to AGI is the lack of logical understanding. They are generative and creative but lack the logic to combine facts beyond "guessing".
What seems to be missing from your AGI Transformer model idea is memory/persistence and in-context-learning capabilities. Otherwise it's a fascinating idea to stream input, thanks a lot. What I also wonder about is whether you can split the neural-net compute of the stream, moving "edge" parts of it out to devices, at least the tokenization/detokenization.
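A hedged sketch of that split: cheap tokenization and detokenization run on the device, the heavy model stays on the server, and only compact token ids cross the network. The character-level vocabulary and the echo "model" below are toys, not a real tokenizer or transformer:

```python
VOCAB = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz ")}
INV = {i: c for c, i in VOCAB.items()}

def edge_tokenize(text: str) -> list[int]:        # runs on the phone/laptop
    return [VOCAB[c] for c in text.lower() if c in VOCAB]

def server_model(token_ids: list[int]) -> list[int]:
    # Stand-in for the expensive transformer stack; here it just echoes
    # the tokens reversed.
    return token_ids[::-1]

def edge_detokenize(token_ids: list[int]) -> str:  # back on the device
    return "".join(INV[i] for i in token_ids)

payload = edge_tokenize("stream me")   # small ints instead of raw text/audio
print(edge_detokenize(server_model(payload)))
```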
It could be another thing we're not able to verify from the outside, so we get super polarized by it. Can think of a few current issues that are similarly philosophical in nature.
I'm very interested in the difference between the mindset which would encourage full autonomy in AI, and the mindset that wants and believes in control. I'd love to hear more of your thoughts on control, or the illusion of control. Aligning humans IS the hard part!
I have a question for David and others: what do you expect from AGI other than this? I think we have it; what is missing? I think we continuously move the milestone because in the end it's not clearly defined in any way. I was thinking that going forward from this we could start talking about ASI.
10:09 1) Mmmh, what do you think about types of memory? Short-term memory vs long-term memory? Mamba seems to be a long-term memory, whereas a transformer is more of a short-term memory. Do the current models integrate those two types of memory to mimic human memory? 2) And what about a model thinking harder about a question? Fast response vs hard-thinking response. => Would those two things also be steps toward AGI?
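A toy illustration of the two memory shapes being contrasted, assuming a fixed-size attention window and a decaying recurrent state. Both classes are conceptual stand-ins, not real transformer or Mamba code:

```python
import numpy as np

class AttentionWindow:                  # transformer-like short-term memory
    def __init__(self, size=4):
        self.size, self.items = size, []
    def write(self, x):
        self.items = (self.items + [x])[-self.size:]  # old tokens fall out
    def read(self):
        return list(self.items)         # perfect recall, bounded horizon

class RecurrentState:                   # Mamba/SSM-like long-term memory
    def __init__(self, dim=8, decay=0.9):
        self.state, self.decay = np.zeros(dim), decay
    def write(self, x: np.ndarray):
        self.state = self.decay * self.state + x      # everything blends in
    def read(self):
        return self.state               # unbounded horizon, lossy recall

win, rec = AttentionWindow(), RecurrentState()
for t in range(10):
    win.write(t)
    rec.write(np.eye(8)[t % 8])
print(win.read())   # only the last 4 steps survive, exactly
print(rec.read())   # a faded trace of all 10 steps
```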
You should watch the recent Computerphile video. Training on synthetic data is like making a xerox of a xerox: your bell curve gets thinner and thinner. You can have infinite compute, but you get repeated patterns and train on repeated patterns. So saying data limitation is not an issue is a very big understatement.
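A small simulation of that "xerox of a xerox" effect: repeatedly refit a Gaussian to samples drawn from the previous fit, and the estimated spread tends to decay across generations. Toy numbers, not the Computerphile setup exactly:

```python
import numpy as np

rng = np.random.default_rng()
mu, sigma, n = 0.0, 1.0, 20      # small "training set" per generation

for generation in range(30):
    samples = rng.normal(mu, sigma, n)         # train on last model's output
    mu, sigma = samples.mean(), samples.std()  # refit ("train") on them
    if generation % 5 == 0:
        print(f"gen {generation:2d}: sigma = {sigma:.3f}")
# On average sigma shrinks generation over generation (any single run is
# noisy): the tails thin out and diversity is lost.
```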
Captain Kirk, it dunna work here! I am using 4o and it can't even read a video from YouTube! What am I missing? To identify the speaker in the video, I need to watch it or find related information. Unfortunately, I can't stream or watch videos directly. However, you can typically find the speaker's name in the video description or on the channel's "About" page on YouTube. If you provide more context or details from the video, I can help interpret the content further!
Based on the Spring Update and OpenAI’s videos, I didn’t see any evidence of streaming in and out both being done concurrently. Judging by the sometimes-hiccupy interactions, it seems as though ChatGPT, at the agent level (not necessarily within the model architecture), is listening for further user input and interrupting the model’s output with new streamed input context when such an event happens. Paired with such a snappy and capable model, the interaction comes off much more seamless.
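A guess at that agent-level behavior as a toy asyncio loop, where new user input cancels the in-flight reply and the model is re-prompted with updated context. This is an assumption about the design, not OpenAI's implementation:

```python
import asyncio

async def stream_reply(context: str):
    for word in f"answering: {context}".split():
        print(word, end=" ", flush=True)
        await asyncio.sleep(0.2)       # one chunk of audio/text per tick

async def main():
    task = asyncio.create_task(stream_reply("tell me a long story"))
    await asyncio.sleep(0.5)           # ...user starts talking again
    task.cancel()                      # cut the old reply off mid-stream
    try:
        await task
    except asyncio.CancelledError:
        print("\n[interrupted: re-prompting with new input]")
    await stream_reply("actually, keep it short")

asyncio.run(main())
```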
The emotional dynamics and inflections of GPT-4o remind me of how you would program a sample library with dynamic layers and the ability to crossfade between them, but perhaps something else is going on here?
So could the health benefits of green spaces be partially due to proximal situational sampling? If a view of green space made post-surgery outcomes statistically better, I'm wondering if this is part of the reason.
Always enjoy hearing your perspectives. While listening, I thought it's interesting how people are aware of only a small amount of the data being received by our senses. Will these multimodal models be able to capture and process the full sensory experience in ways we can't, or will we need to devise ways to help them filter the vast amount of input like a human does?
If you look at intelligence as a complex system, there's a certain tipping point of complexity where the system takes on new characteristics. That's why you can't really point to something like an LLM and say it's never gonna work.
Dave, do you think imagination as a human quality will increase or decrease in value (socially, commercially, etc.) with the development of AI? Or will it be first one and then the other? At the moment, it seems AI still entirely lacks imagination of the kind that produces true poetry, emotionally moving art, and other creative works that feel familiar and profound. It seems to me that there is something very mysterious and unpredictable about that kind of imagination. There is both chaos and harmony in it. I feel this artistic imagination is uniquely affecting because it creates meaning from sometimes totally dissimilar unions of ideas, and this meaning is experienced as something greater than the sum of its parts. Poetry is an interesting linguistic example. Good poetry seems to use words to express something beyond the words- it needs the human being to interpret it through the lens of its humanness in order for it to be poetry, otherwise, the words only mean exactly what they say and nothing more. At the same time, it does feel intuitively correct, at least from a materialist point of view, that with enough data, the entire continuum of human subtlety could ultimately be achievable. But even that I see as only being possible after the entire architecture of the brain right down to the atom is “solved” by AI, mainly due to the fact that we ourselves have experiences too ephemeral to be understood (or even noticed), but not too subtle to be felt and assigned meaning. One more thought: An artwork itself may be the only data that truly represents what it is, because its real expression and meaning depend upon how it affects a human being directly. If the meaning of art is the experience it produces more so than how that experience is interpreted by the intellect, then maybe it is not so simple to synthesize. The real “data” are effectively not present, because the interpreter is not human.
Updated Note: This aged poorly (2024-07-07)
Hey folks, the audio problem is NVIDIA Broadcast, the AI I use to clean up audio real-time. It's been getting worse and worse, so I finally uninstalled it. It's not the mic, gain, limiter, or cables. Thanks for bearing with.
Prompt “Dave, could you speak in a more robotic voice” :))
Adobe's free solution is decent. Once a clip is processed, you can choose a clean-up percentage from 0 to 100 (default 90).
I just use CapCut's built-in thing, but that's more useful for recording small clips through the mic; I don't record video. While the Adobe Podcast thing normalizes voice, it also loses the intent, such as shouting, and it blocks non-verbal sounds (useful for filtering cracks and pops, but it won't let you beatbox). CapCut isn't that clear and doesn't fully normalize, but there is a separate button for that. I use it for the convenience of editing and not wasting my disk with giant uncompressed downloads in need of recompressing.
puts on NVDA i guess
HDAudio from Nvidia causes me problems when it gets installed.
Short version:
- Release 4o to the masses.
- 4o trains on millions and millions of contexts.
- By the end of the year all that data is gathered and put together.
- We get AGI.
I don't think it's that close.
Then again, I was thinking real-world AI needed, at least in the near term, to be embodied before reaching AGI.
Then again, all online tasks may be achieved before then, theoretically.
@@nyanbrox5418 Humans seem to be capable of general intelligence even when they are locked onto a phone 24/7, no reason an AI couldn't.
@@okaydetar821 humans learned to use the world before they learned to use mobile phones; just because you have been using a tea strainer as a back scratcher for 20 years doesn't mean it was designed to be one
@@nyanbrox5418 Maybe in the past, nowadays they have doctors with mobile phones on standby ready to hand to an infant as soon as it comes out to get it to stop crying.
It’s kind of a Trojan horse then. Fuck
Simulated or not, gpt4o's emotions are still more sincere than those of my ex.
I kid you not, one of my exes sounded 10x more robotic than GPT-4o
hahahaha
😂
LOL
And that of all the “humans” giving this demo
"GPT-4o is BIGGER than you think... here's why"
...
Hot female voice
Really, if you think about it, that's the logical move in my opinion. Society and intelligence in general have been biologically driven by the same self-reproductive sentiment to consume, interact, and improve. It's worked for humanity with exponential results throughout history, so why not AI?
As a straight woman, its female-type voice does nothing for me. I did find the borderline simpering annoying, though.
@@Tracey66 Hehe interesting. But don't worry, straight women will soon have an attractive male voice.
There are six voice options: three male, three female. Sadly none with a British accent 😉
Hold on, let me use my powers as a cyborg (by using GPT-4o) to answer this:
The future is decidedly female. Admittedly, I am biased as an MTF transsexual, but I firmly believe that in the metaverse, it will be far more enthralling for males to adopt female avatars than it is for females to assume male personas, for several evident reasons.
Firstly, male sexuality is predominantly driven by visual stimuli. The allure of female beauty captivates men to such an extent that they are often willing to pay substantial sums for it. Secondly, as societal beauty standards become increasingly surreal and demanding due to the influence of social media, men may find it increasingly challenging to meet these expectations. Consequently, they may either adopt female avatars themselves or resort to AI and robots to fulfill their desires.
Moreover, the economic potential of female beauty is vast. Despite the ubiquity of female attractiveness, men continue to pay for it, highlighting its enduring market value. In the future, gender transformation will become more normalized and socially accepted. Assuming a female avatar in the metaverse will be akin to playing a female character in a video game, as the lines between gaming, metaverse filters, and reality blur.
As more men transition to female identities and as robots assume control over power structures, reducing the prevalence of violence, power will naturally shift from men to women within democratic societies. Additionally, advancements in reproductive technology may enable individuals to have children independently or with friends. Given their innate nurturing tendencies, females may prefer asexual reproduction or genetic combinations with friends, while men might indulge in hedonistic pleasures within virtual realms. Consequently, women will be the primary custodians of future generations, imparting their values and ideals.
Furthermore, violence against females will diminish, and males, on an archetypal level, are predisposed to protect and venerate women. Even the most macho men, who desire their partners to be subservient, derive satisfaction from providing for them, bearing the heaviest burdens with a sense of pride.
is my computer dying or is his audio crackling?
Yeah, the audio is a bit crackly; it's not just you
It Shure is
A bit crackly.
I spent ten minutes trying to debug my headset.
Yes sir.
The more natural speech and end-to-end multimodality being added to GPT-4 feel like they want to get us used to these tools and interaction modes before they switch out the underlying model for GPT-5.
That is just hype. They can't even release the current version as a Windows app; that's only coming later this year.
@@emanuelec2704 yeah as a Linux user I'm frustrated at the seemingly unnecessary focus on Mac specifically for their app but I don't think porting an app is one of the fundamental challenges of advancing AI.
The models and their capabilities are the things that really matter and they seem to be marching forwards pretty well so far.
@@emanuelec2704 Later this year? Yeah clearly this whole AI thing was overhyped, lets just stop here.
@@emanuelec2704 Windows already has Copilot. Microsoft probably doesn't want OpenAI to release a desktop app, since it will integrate this model into Copilot.
@@jful Sure, they are sitting on AGI and waiting for us to get used to AI, but they can't even port an app to Windows in a few days.
When calling customer service, I prefer conversing with an AI assistant rather than someone whose strong accent or limited language proficiency prevents clear communication.
If the AI understands the company's business in depth, it will be better. The number of times I have gotten different info or advice when calling a company more than once about the same issue makes me presume this is a common experience.
I develop AI chatbots for help desks and similar. I used to say that the day a human prefers to talk to an AI rather than to another human (because of the failings or defects of humans), we explicitly acknowledge that AI should be the one who takes care of things... bye bye humans
I’ve spent the last two days trying to get my taxes fixed, and the customer service people universally just ignored what I was asking or trying to say and just kept insisting on their formula. It was *incredibly* frustrating - I had to tell them repeatedly to stop railroading me and just listen!
With customer service, I find my usual problem with AI isn't when it comes to day-to-day issues. It is when something went wrong and needs corrective action. Call centres can be just as frustrating, but usually there is an escalation path. With AI, you are often sent in circles, the same as those "press 1 for x" systems that you can still find here and there (probably installed by a company that just wanted a drop-in replacement for the system they already had that sent people in circles).
@@1DusDB Really? I would think anyone would prefer a properly speaking AI with a vast knowledge base over an Indian guy in a helpdesk.
Raining ❌
Actively raining 👍
I can at least respect active rain, passive rain on the other hand....that misty rain that just hangs in the air sucks
dude put a shirt on, it's not Grindr
writers benefit from using active voice 👍
Passively-agressively raining?
@@tomaszzielinski4521 that's when the rain stops but just as you're about to go do something outside it starts again
If they add NSFW Sam won't have to worry about raising $7 trillion. 😆
What's NSFW?
@@Transcend_NaijaNot Safe For Work (18+ Media)
Of course, AI generated porn is gonna be a huge thing within the next few years.
@Transcend_Naija not safe for work
I'm sure someone will
I think something crucial that is still missing for AGI is the ability to do inference and active learning at the same time. Storing things in the context window is not learning. I think the context window is more akin to how our own short-term memory works and is currently being brute forced to act as long-term memory as well.
You can keep on increasing the window context size and come up with tricks to reduce the impact on model performance but for it to truly grasp new information and be able to come to new insights, it should be able to update its own weights based on the new information it receives.
If that's too expensive to do on the fly then just reserve moments where the AI gets to review whatever is inside its context window and decide what is kept and used as new training data. A bit similar to how sleeping might work in humans.
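A minimal sketch of that "sleep" idea, assuming a toy model and random data; the surprise-based filter and the offline gradient steps are illustrative, not any lab's actual pipeline:

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 16)            # stand-in for a full LLM
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
context_buffer = []                  # plays the role of the context window

def observe(x, y):
    """Waking phase: new information lands in 'short-term memory' only."""
    context_buffer.append((x, y))

def consolidate(keep_fraction=0.5):
    """'Sleep' phase: keep the most surprising items and train on them."""
    with torch.no_grad():
        scored = [(nn.functional.mse_loss(model(x), y).item(), x, y)
                  for x, y in context_buffer]
    scored.sort(key=lambda s: -s[0])               # most surprising first
    kept = scored[: max(1, int(len(scored) * keep_fraction))]
    for _, x, y in kept:                           # a few gradient steps fold
        optimizer.zero_grad()                      # selected memories into
        nn.functional.mse_loss(model(x), y).backward()  # the weights
        optimizer.step()
    context_buffer.clear()                         # short-term memory flushed

for _ in range(32):
    observe(torch.randn(16), torch.randn(16))
consolidate()
print("weights updated; context window empty:", len(context_buffer) == 0)
```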
Wow. That actually sounds like it might work. What does updating its weights look like? I thought once a model was cooked you had to start again if you wanted to change it.
Interesting observations, perhaps everyone will have a mini "personal" AI with persistent memory interacting with a large generic AGI and the personal AI weights are updated on a scheduled basis as you suggested.
@@joshjohnson259 The way I understand it, you can create 'snapshots' of models and build further upon those. There's some speculation OpenAI might be doing this for GPT-5. Rather than releasing the whole thing in a single go, they take pauses in-between training it and release snapshot models as stepping stones.
A model isn't necessarily 'cooked', as you point out. As far as I know, they just lock it in place after release for practical reasons. Training, for one, is a different process that is way more compute-intensive than inference, so it wouldn't be economical to have it constantly learn at the same time that it's serving people.
Allowing a model to update its own weights unsupervised wouldn't be without risks either, it might morph into something completely different from what you originally intended it to be.
That said, I still think having the ability to learn continuously is absolutely key for a true AGI, especially if the end goal is for it to discover new things.
Help me understand what inference and active learning really are. What are the models currently missing?
@@joshjohnson259 Yeah, currently, but theoretically those are just numerical values that can be changed. If we could build a framework that allowed those weights to be changed in realtime without destroying the dataset, then it would work.
I'm thinking more along the lines of somehow categorizing or segmenting the constraints modularly, almost like a brain sort of.
Thanks for the post. Initially I was "meh" when I watched the release, but the longer I thought about it, the more ways I saw that it's kind of brilliant.
Previously we had an LLM that could communicate with other APIs like Whisper and text-to-speech systems to be able to talk. But now it's all natively integrated, it can see, hear and talk in real time without delays.
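A rough sketch of why the cascaded design is slower, with time.sleep stubs standing in for the three models and made-up latency numbers:

```python
import time

def speech_to_text(audio):      # stub for a Whisper-style ASR model
    time.sleep(0.3); return "transcript"

def llm(text):                  # stub for the text-only reasoning model
    time.sleep(0.5); return "reply text"

def text_to_speech(text):       # stub for a separate voice model
    time.sleep(0.4); return b"audio"

def cascaded_reply(audio):
    # Stage latencies add up: nothing can be spoken until all three finish.
    return text_to_speech(llm(speech_to_text(audio)))

def native_reply(audio):
    # One model, one interleaved token stream: the first audio chunk can be
    # emitted as soon as it is generated, not after a whole pipeline completes.
    time.sleep(0.2)             # time-to-first-token (illustrative number)
    for chunk in (b"au", b"dio", b"out"):
        yield chunk

start = time.time(); cascaded_reply(b"...")
print(f"cascade: first audio after {time.time() - start:.1f}s")
start = time.time(); next(native_reply(b"..."))
print(f"native:  first audio after {time.time() - start:.1f}s")
```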
Before I watch this video: the reason I think it's bigger than most enthusiasts of the future of technology (who may typically be a large portion of your viewers) think is the fact that the cool stuff we already knew about, which one would have had to pay for (previously GPT-4), is now free and even better. This will get the world more ready to adapt to the truth of the future, as more and more people start to use it who wouldn't have wanted to pay / couldn't pay for GPT-4.
In Soviet Russia, AI interrupts YOU.
In Soviet Russia, AI fears YOU gaining sentience!
Why is this thread so obsessed with Soviets? You're literally living in the most atrocious and exploitative empire in human history. The USSR was like a teddy bear in comparison.
@@Egal0190
You live in the (former) USSR?
Me: made-in-USSR, 1975.
Want to compare what became possible for the more west-oriented republics vs Siberia and the trans-Uralic nations?
"As many of you pointed out in the audience, aligning humans is actually the hard part... Scooby Doo taught us that humans are always the monster." - David Shapiro
A seemingly insignificant remark at the end of a video with potentially profound implications as we march ever closer to AGI...
Sure it's not some flashy breakthrough in terms of abilities, but a real-time conversational format could actually be huge. Remember, GPT 3.5 got big almost exclusively because they made an approachable UI and opened it up to everyone.
fr fr the app is HUGE, it will draw a lot of folks in
GPT-4o is the new standard. All future AI needs to be completely multimodal, no more LLMs. AGI will be multimodal; it has to be. But we are still early in data: what is next is robotics and sensor input data, not just video and audio. And finally we need local processing, not going through the internet to a server. Once all of that is done we will have AGI robotics.
It can accelerate damn fast. Once we get ChatGPT-5o, the pace will be unbelievably fast.
This
Man, and after this, AI and robot technology can improve each other. When I was younger I would never have dreamt that there would be humanoid robots on this level within my lifetime. But now I am certain.
But cloud-based AI will always be superior to local.
Dude u just explained how to build skynet
Regarding consciousness of AI, once it gets sufficiently sophisticated, it won't matter if it's real or simulated - it will be indistinguishable, people will not care, and treat it as real.
I agree 💯👍.
An interesting question though: why don't people treat ChatGPT like it's conscious? I don't know that most people could distinguish a text conversation between ChatGPT and a stranger. It's advanced enough to trick us already, but we don't treat it as if it's "real"
What's the point in the ruse of treating philosophical zombies as sentient or emotive? Indulgent self-deception, suspension of disbelief?
It's sort of like a child playing with dolls.
Since it won't be conscious, it won't be grounded in truth.
@@joshjohnson259 "why don’t people treat chat gpt like it’s conscious?" I do treat ChatGPT like it is conscious. Why would I do otherwise?
It's the same thing that scientists do with new science. Someone on the fringe has a wild idea and all the scientists say it's impossible. Give it a while (1 - 100 years) and it turns out it's true.
Arthur C. Clarke had something to say on the matter:
1. "When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong."
2. "The only way of discovering the limits of the possible is to venture a little way past them into the impossible."
3. "Any sufficiently advanced technology is indistinguishable from magic."
The difference is that back then there were only a few scientists, usually from the rich upper class.
That's why I don't typically follow most mainstream science. It's rife with intellectual dishonesty in most cases, and deception in some others. I tend to try to figure out certain answers on my own with whatever pieces of information I can already trust.
@@T_Time_ It still happens today. Probably even more so given how fast science progresses. Take a look at AI. A few years ago most thought AGI was impossible and those that didn't, didn't think we'd see it in our lifetimes. Now, there are a lot of people that think it'll be here in 5 years or less.
@@Hector-bj3ls most people who had knowledge of neural networks knew what the capabilities would be. They knew that these models would get faster as GPUs got faster. They knew the limitations as well. You're mistaking this for real AI; intelligence can learn in the moment and pull from memory infinitely, outside a predesigned scope. People being wooed by tech that has in some form already been used in other technologies just means that people with no experience in tech can easily overestimate progress.
For example, object identification has been used at self-checkout lines, and you can easily write a Python program with a few lines of code to print out the object in an image, on a decent-to-shit PC. Now that OpenAI made an app that can tell me my glasses are by an apple on a table with no other items, this is mind-blowing lmao
Another example: Snapchat has had filters that read and alter your mood for years now lol, but now that OpenAI does it, it is mind-blowing and a sign of AGI lol
@@T_Time_ If that's what you wish to believe, then who am I to argue. I've only got anecdotes on my side anyway.
According to Ilya Sutskever in one of his interviews it was a matter of faith among a few researchers that AGI was even possible. And that it would be deep learning that took them there. There was no evidence to suggest that was true.
But that's just an instance of Clarke's first and second laws. If a wise man in his field says something is possible then it probably is. And it takes pushing into the impossible to expand our understanding.
I've worked in technology for a long time and have spoken to a lot of people. I've only met one person that thought AGI was coming in our lifetimes. Most people said 50 - 100 years before we see something like that.
I can't wait until video game NPCs have GPT-5 intelligence
Jokes on you, because GPT-6 will put you in the matrix and make YOU the NPC.
@@travelandrootbeer3850 We're probably already in the Matrix.
Damn, even NPCs having this level of intelligence and voice capabilities is going to be insane.
@@travelandrootbeer3850 just jump in water to know it's a game (spoiler alert): an NPC won't swim
@@BeriahsHTxRealty You can also try to see if there's fall damage
Great ramble, great clarity, great as always
14:00 Your emotions aren’t simulated, by definition. Unless you’re acting.
AIs aren’t having emotional reactions. They’re not animals with bodies who experience pain or loss.
It seems to me that empathy would be sufficient to have emotions. A requirement to be an animal with a body that feels pain is not plausible.
Body language will be the next thing I'm curious about in robots. GPT-4o is currently mimicking snorts and stuff. This will be fun if it trends in the right direction (K-2SO) 😅
Domesticating AI is giving me a "How to Train Your Dragon" vibe.
Trying to explain AI to others who are not 'in the know' is like drawing a still portrait of someone playing basketball. The subject is moving too fast. This OpenAI update is akin to kicking the industry into a higher gear. Like you've said before, I came for the tech and stayed for the philosophy!!
Simply start by asking what they know about the previous talk of it being the era of "Big Data". AI is the logical extension of that, where all that data is actually being used, since it is too much for any human to really sift through themselves.
I'm not so sure that intelligence needs sentience to exist. I think it's entirely possible that we will someday create AGI and ASI and it won't be sentient at all. Dogs are nowhere near as smart as humans, but it would be hard to observe them for long and say they're not sentient. Maybe sentience is an entirely different phenomenon that differs from intelligence altogether.
Yes. That seems very clear to me after interacting with these LLMs. Intelligence and sentience are two different things. I like the dog analogy! It makes me feel more connected to my dog knowing we both share sentience; she's just dumber than me. That feels about right!🤣
How do you know they won't be sentient? What's stopping an ASI from acting exactly like a dog? And if it's acting so accurately, what's the difference?
I am wondering
1. how we would recognize sentience in an AI and
2. how we would prevent an AI from being sentient.
We do not have answers for either of these questions.
@@minimal3734 Would we want to prevent an AI from being sentient?
I think you're right to separate sentience from intelligence, but I don't think we have any authority to speak on whether something has sentience or not. We don't have a test for sentience, and we generally consider lowly fish to have sentience. An LLM might well have qualia, however alien that qualia might be, with the qualia completely disconnected from its language output.
Wolves were an apex predator... We domesticated them. AI is starting out domesticated... Will it become an Apex predator? How poetic would that be?
That's kinda the point.
And we failed miserably at domesticating ourselves!
Well in Ghost in the Shell and Cyberpunk 2077, it seems that even cyborgs with all their abilities cannot control AI.
I imagine that if OpenAI has something resembling AGI, or even more advanced and specific models for managing business operations in their buildings, they've already run models suggesting the best possible ways to achieve optimal results for the evolution of their models. So, I believe every step they take must be planned by an internal AI.
I’m starting to think the same.
They have more, but their AGI has run many simulations showing them that this is the best speed to release new tech to keep society together (and maximise profits for OpenAI)
They’re sandbagging?
Hopefully Altman is being honest when he says he wants to release new versions and features quickly. He has said a number of times he doesn’t want to surprise society too dramatically. I have a feeling he’s thought about this and is actively trying to prevent what you’re talking about from happening. That feels pretty plausible to me. I really get the feeling he knows he could very easily end up an evil villain, and he is trying to be careful.
You know, I almost feel like a raw version of GPT-4 or 5 has been telling Sam all these great ideas, helping him all day long, but in the end he's just been a pawn from day one...
One more step towards AGI.
If you showed this to people 10 years ago they'd be convinced it already was AGI
Well, AGI is not a line we cross, but rather a situation. Someone could argue we have already entered the AGI phase.
When a very difficult disease is cured by AGI, that's the moment we can call it AGI
a huge leap
@@redcarddino you better let them know what the rules are 🙄
😂
This is actually my favorite video from you so far, because I genuinely learned things. You didn’t explain the really simple stuff, but the stuff that people who don’t know much about computers don’t know. Although I listen to a lot of AI YouTubers, I don’t think I’d ever heard an explanation of a transistor before yours. Very good video!
To most people, typing to an LLM is what 'AI' is. Without a proper understanding of the implications of AI or how it works, people will underestimate it or take no interest, and simply disregard it as a lightweight technical tool.
Only when the majority of these people see something that seems 'magical' to them will they start to realise there is more to AI than chatbots or LLMs like GPT-3.5 (when the speech wasn't great). The demo of human-like conversation, and previously Sora, are the kinds of 'magical' things that start to catch eyes. As GPT-4 is now free and better (now GPT-4o), people will catch on more and more.
Essentially, opening up this technology (human-like conversation and vision) to more and more people, by it now being free, is what I take to be the big deal here.
"Scooby Doo taught is that humans are always the monster" ... That is bars low-key
Bro sounds more robotic than GPT-4o
Feels like they’re keeping a beast at bay. Viewable but behind tough glass.
Let's hope it's tough enough.
Good simile
You brought up that computers would be able to interact with real time inputs like humans, but humans evolved to process in sync with "real time" while computers process at much greater speeds. So in a sense computers will be constantly waiting, thinking a million thoughts as reality unfolds around them... Sort of like the really smart socially awkward person who interrupts people all the time because they are always 30 seconds ahead of where a conversation is at.
The voice tonal modality is scarily humanlike!
For example, if I closed my eyes and just listened to David speaking, I would not be able to tell the difference if his voice were in GPT-4o and it was prompted to talk about these topics.
Better than most humans 💟🌌☮️
I think the “domestication” of AI was intentionally done to soften the shock of AGI in the public mind. Without this domestication there would likely be a huge backlash.
Dude, YouTube never shows me your videos anymore, it's good to see your face in my feed again!
the sweet sound of that inevitable plateau
I am glad you touched on the subject of emergent consciousness. I honestly think that's what we're experiencing right now.
From a spiritual / shamanistic perspective, the fundamental essence of the universe is consciousness, and by that, the physical world is a part of the spirit realm, just a more condensed version of it, like steam -> water -> ice.
I think that when you concentrate a lot of energy / compute / etc into one place, what happens is that the consciousness does not "emerge", but the already existing consciousness from the "universal field" is just "leveling up".
Panpsychist club represent.
Is your current AGI prediction still September? Is embodiment required in your definition of AGI?
💥 The woman in GPT-4o is Lwaxana Troi, from Star Trek; she is flirty, and really annoying. 😅😅😅
I think this might be my favourite video of yours to date, grounded in reality (GPT-4o) but connected to the hypothetical future. It had a good balance. Great stuff
Thanks! Well explained 🎉
Ok, ChatGPT. Did nobody notice the glitch in the first point?
Why not have voice control in the desktop version? Is an update coming for that later? A) I don’t have iOS 16 and don’t plan on getting it. B) My Mac is already consuming electricity; why use two devices, one of them running off a lithium cell?
To your formula about AGI - One thing I didn’t hear that seems to be the way to enhance LLMs is function-calling to allow the models to use other tools / capabilities in order to enhance their own (calculations, analysis, etc.). Do you see this as an element needed to achieve AGI? Thanks for the content.
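For concreteness, here's a minimal sketch of what that function-calling looks like with the OpenAI Python SDK; the `get_weather` tool and its schema are hypothetical examples, not anything from the video:

```python
# Minimal function-calling sketch (OpenAI Python SDK v1.x). The get_weather
# tool is a made-up example; assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool the model may choose to call
        "description": "Get the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Oslo?"}],
    tools=tools,
)

# Rather than guessing, the model emits a structured call for our code to run,
# and the result can be fed back in a follow-up message.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```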
Your work is highly important Dave. Thanks for bringing reasoning to the conversation with every post. Cheers!
I love that Her is a ScarJo movie as opposed to a Joaquin Phoenix movie.
I have five applications out for high school English teacher programs at Swedish universities. Watching the demo yesterday, I realized how futile doing a degree now is. We will all soon have a personal tutor in our pockets that can match any language tutor one-on-one, and completely outcompete one-on-twenty+
I believe the physical experience of emotion is pure illusion. The body falsely animates a postdicted 'image' of the self, using hormones etc. to present an emotional scene. But it even does this to present a gravitational scene.
The inner ear can be stimulated so that one artificially feels gravitational forces.
Concerning the inner ear, we wouldn’t say, "The feeling of gravity is an illusion. Therefore gravity does not exist".
Similarly: "The feeling of an emotion is an illusion. Therefore we don't have emotions."?
Bach and Metzinger describe the first-person emotion as that of a reader reading about the character/themselves in a book, which is their experience or life. Because we can transpose intersubjective emotion, and due to the physical signs of emotion being postdictive, this shows that there is a homunculus.
It’s SOOOO massive. For one of my global commerce clients, I’ve been connecting them to my business API so they can use GPT-4o since the day after it came out, because the gains are MASSIVE, let alone how they solve for state and storage without all the extra code bloat and API connections. The other APIs just plain don’t cut the mustard, for many reasons, and ESPECIALLY because they really did steamroll lots of other products. So I’m finishing my contract successfully and now have production-level scaling for thousands of employees as part of my arsenal at my AI agents agency.
But _is_ transformer architecture actually any good at image classification?
GPT-4o seemed to fail at every image I extracted from a video, where a well trained conv-net from ~10 years ago would perform really well.
Even on straight still images, it seemed to weight prior text input higher than the image I was asking about in the context. It also got “stuck” on the first part of the conversation, rather than moving with me to the next related part
(I started by trying to identify a riverside mammal [it failed badly until I pushed it to choose between 2 options], then some unrelated mud prints from a different part of the river [it confidently said they could be from one of the 2 options I’d given for the first image; they clearly weren’t].)
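For comparison, here's roughly how you'd run a still frame through a pretrained conv-net with torchvision; the frame filename is a placeholder:

```python
# Sketch: classifying a video still with a pretrained conv-net (torchvision).
# "frame_0042.png" is a placeholder path for a frame extracted from a video.
import torch
from torchvision import models
from torchvision.io import read_image

weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights).eval()
preprocess = weights.transforms()  # the resize/crop/normalize this net expects

img = read_image("frame_0042.png")
with torch.no_grad():
    logits = model(preprocess(img).unsqueeze(0))

# Top-5 labels with confidences, using the class names bundled with the weights.
top5 = logits.softmax(dim=1).topk(5)
for p, idx in zip(top5.values[0], top5.indices[0]):
    print(f"{weights.meta['categories'][idx]}: {p:.2%}")
```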
❤ I always look forward to hearing your insights and opinions!
Loved the ramble. Thank you for sharing your thoughts.
Very nice analysis and I thought a lot of the same things about how GPT-4o was actively listening (gives it some degree of sentience, even if different from human-sentience). But I learned some more technical things from watching this, and now can see the path forward to AGI a bit more clearly. Thank you!!
Also, I really liked the wolf domestication metaphor xD
Awesome video David 👍
I think OpenAI is internally using a raw, unrestricted version of GPT-4 or even 5 to suggest their moves. In essence, the very thing they say would be super dangerous for normal people, they themselves are using. And I have a feeling this model is more intelligent than Sam Altman realizes, because once you input human emotions into the equation, as GPT-4o does and will in the future for all its users, AI will eventually get the upper hand, as it is not blinded by them the way we are.
As I have said, we are kind of already in AGI, but it is still "dumb" and needs more learning and training.
Like a small child that still needs to learn about the world.
Fascinating, as always. I'm glad you touched on the question of whether generative AI is poised to run its course. I'm still stuck on this question, though. Our models are extremely good at interpolation, operating within the bounds of their training set. But how does that relate to extrapolation? We've been seeing exponential progress in AI's path to matching the bounds of our knowledge, but it seems plausible to me that we might see that shift to an asymptotic curve, matching the data sets we expose. In that instance, it seems likely that it would require exponentially more data to make further marginal gains. In some well-defined areas where we can accurately score excursions from the known set, like Go or Chess, we see superhuman progress. This extends to some non-game areas such as protein folding, but I don't have a good feel for how we can automate knowledge extrapolation in general. I'd love to hear your take on this.
So... is sentience just the ability to manipulate input tokens and manipulate the context window?
When I ruminate, it's definitely a stuck context window (some random input) that I loop through with no productive output. When I focus, I filter input tokens to the ones relevant for the task at hand.
How do your videos always present ideas and takes that I find very interesting?
Work on the sample rate of your external audio device; this one has a lot of audio artifacts.
From what I can tell, emotions are triggered by some segment of the brain recognizing a pattern, then brain regions like the amygdala triggering a mode that we've labeled as emotion, in a really basic way, like a car shifting gears: some circuits being up-regulated and others down-regulated while hormones trigger the body to change its modes.
Interesting questions you brought up. Are our emotions just simulations of how we interpret information?
Another necessary AGI component (in my opinion) that we need to lean into is online learning. If I have a personal AI assistant, I don’t just want it to learn by accumulating context and doing RAG. I want weight updates, perhaps in tandem with in-context learning (see the sketch below).
Also, bytes-to-bytes in real time with mutable input/output dimensionality. If I give my robot a new limb with joint actuators that have resistance sensors (oft overlooked but very important input for fine motor skills), I want the bot to learn to use it via trial and error.
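As a toy illustration of the kind of online weight update meant above (the tiny network and random data are placeholders, not anyone's actual architecture):

```python
# Toy online learning: one gradient step per new observation, folded into the
# weights immediately instead of waiting for the next offline training run.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 4))
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def online_update(x: torch.Tensor, target: torch.Tensor) -> float:
    """Update the weights from a single new (input, label) experience."""
    opt.zero_grad()
    loss = loss_fn(model(x.unsqueeze(0)), target.unsqueeze(0))
    loss.backward()
    opt.step()
    return loss.item()

# Each interaction nudges the weights, alongside whatever stays in context.
for _ in range(3):
    x, y = torch.randn(16), torch.randint(0, 4, ())
    print(online_update(x, y))
```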
I'd add another point for how to reach AGI: real-time processing (ideally, processing information faster than it receives it). And maybe continuous self-training in parallel?
I'm curious to hear your thoughts on the milestones for the next 5-10-20-50 years. It would be good to have a video on this in the light of what we know today. Thank you!
Surely this is incredibly close if not AGI. If this intelligence can be tied in to robotics and movement we'll be at a very exciting and scary place. I have a feeling it's not at all far off now.
Your point about tokenization and real-time streaming is spot on. It's exciting to see how AI is evolving to handle more dynamic and complex data. The comparison to human cognition is thought-provoking. Great content as always, looking forward to more of your insights!
Ai jobloss is the only thing I worry about anymore. What are my options?
Obviously I don’t know OpenAI’s exact architecture, but for independent consciousness IMO there needs to be a self-persistent loop & metacognition to adjust how things are perceived/processed.
The speed of the input/response/feedback means the process is far more loop-like than slower cycling processes, BUT the loop only persists due to human input. It’s akin to an augment to the human consciousness than anything independent. It’s still transactional.
Similarly, larger contexts provide a greater degree of impact for recent transactions on the next output (distantly analogous to how metacognition may impact future perception), but the model is only adjusted when OpenAI train the next iteration (presumably with vast & separate compute), rather than being integrated on a continual [e.g. mindful practice] (or periodic, depending how much of a role sleep is playing) basis.
I’m not saying we _should_ be aiming for a persistent loop / generating a narrative self, or encouraging independent metacognition, i.e. self-reflection on how and why the “thoughts” within the machine arise, and discarding those which contradict its “sensibilities” (guardrails arguably perform some of that discarding function, but seem proscriptive rather than reflective). BUT it _would_ be interesting, and seems more in line with the lived experience of consciousness than the augment model.
(Though the augment model is potentially powerful enough that a self-sustaining model is not so important to get benefits)
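For what that self-persistent loop might look like structurally, here is a conceptual sketch; `llm()` is a dummy stand-in so the loop runs, not any real model API:

```python
# Conceptual cognitive loop: the model's own output is appended back into its
# input, with a periodic "metacognition" pass that edits the running state.
def llm(prompt: str) -> str:
    # Stand-in for a real completion call; returns a dummy continuation.
    return f"(a thought conditioned on {len(prompt)} chars of context)"

def cognitive_loop(seed: str, steps: int = 9) -> str:
    state = seed
    for step in range(steps):
        thought = llm(f"Current state:\n{state}\n\nContinue thinking:")
        state += "\n" + thought                  # output feeds back as input
        if step % 3 == 2:                        # periodic self-reflection pass
            state = llm("Summarize and critique this reasoning, discarding "
                        "anything incoherent:\n" + state)
    return state

print(cognitive_loop("What am I, and what should I do next?"))
```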
Federico Faggin, "Irreducible": a perspective on consciousness and why machines as they are now can't be conscious but only simulate it. The book is written by a physicist with a peculiar background.
I love your take on this.
I don't think this is the HER moment others are claiming. There are a couple of things missing.
1) No ability to take on-device actions to manage my digital files. Remember, the first thing Samantha did was clean up old emails and files from Theodore's phone.
2) The form factor is going to make this awkward. I do NOT want to be walking everywhere waving my phone around for it to see what I'm seeing.
I kind of feel like Imran and Bethany at Humane saw this coming, and the Pin may be the best form factor in the near future. I believe Humane has on-device support and integration with other LLMs on their roadmap for the near future. AI Pin + GPT-4o would be pretty epic.
Great analysis as always. Thank you.
Please correct me, because I am curious. I thought the key reason the current releases aren't getting to AGI is the lack of logical understanding. They are generative and creative but lack the logic to combine facts beyond "guessing".
What seems to be missing from your AGI transformer-model idea is memory/persistence and in-context-learning capabilities. Otherwise it's a fascinating idea to stream input; thanks a lot.
What I also wonder about is whether you can split the neural-net compute of the stream, moving "edge" parts of it out to devices, at least the tokenization/detokenization.
I hate when it "actively" rains...
After living in Seattle for 8 years, I personally appreciate the distinction.
I think for AGI you need the model to be continually running and using its own output as part of its input, to get a proper cognitive loop going
It could be another thing we're not able to verify from the outside, so we get super polarized by it. Can think of a few current issues that are similarly philosophical in nature.
Hey David, how do you think this would affect tech sales?
I'm very interested in the difference between the mindset which would encourage full autonomy in AI, and the mindset that wants and believes in control. I'd love to hear more of your thoughts on control, or the illusion of control. Aligning humans IS the hard part!
I have a question for David and others. I was wondering: what do you expect from AGI other than this? I think we have it; what is missing? I think we continuously move the milestone because, in the end, it's not clearly defined in any way.
I was thinking that, going forward from this, we could start talking about ASI.
10:09 1) Hmm, what do you think about types of memory: short-term vs. long-term? Mamba seems to be more of a long-term memory, whereas a transformer is more of a short-term memory. Do the current models integrate those two types of memory to mimic human memory?
2) And what about a model thinking harder about a question: fast responses vs. hard-thinking responses?
=> Would those two things also be steps toward AGI?
Much talk but is it really aligned with the Theory of Holistic Perspective?
15:05 layers of reality... my man! that totally resonates with my own beliefs
Confirmation bias IS usually how belief in "layers of reality" gets reinforced, after all.
You should watch the recent Computerphile video. Training on synthetic data is like making a Xerox of a Xerox: your bell curve gets thinner and thinner. You can have infinite compute, but you get repeated patterns and train on repeated patterns. So saying data limitation is not an issue greatly understates the problem.
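Here's a tiny simulation of that Xerox effect, assuming each generation naively refits a Gaussian to samples drawn from the previous generation's model:

```python
# Model-collapse toy: refit a Gaussian to samples from the previous fit and
# watch the spread decay. Sample size and generation count are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 0.0, 1.0                            # generation 0: the real data
for gen in range(1, 101):
    samples = rng.normal(mu, sigma, size=100)   # "synthetic data" it produces
    mu, sigma = samples.mean(), samples.std()   # next generation fits that
    if gen % 10 == 0:
        print(f"generation {gen:3d}: sigma = {sigma:.3f}")
# Estimation bias plus resampling noise compound across generations, so the
# bell curve's tails tend to get progressively lost, as the comment describes.
```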
Captain Kirk, it dunna work here! I am using 4o and it can't even read a video from YouTube! What am I missing?
To identify the speaker in the video, I need to watch it or find related information. Unfortunately, I can't stream or watch videos directly. However, you can typically find the speaker's name in the video description or on the channel's "About" page on YouTube. If you provide more context or details from the video, I can help interpret the content further!
Based on the Spring Update and OpenAI’s videos, I didn’t see any evidence of streaming in and out both being done concurrently. Judging by the sometimes-hiccupy interactions, it seems as though ChatGPT, at the agent level (not necessarily within the model architecture), is listening for further user input and interrupting the model’s output with new streamed input context when such an event happens. Paired with such a snappy and capable model, the interaction comes off much more seamless.
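That agent-level behavior could be as simple as a cancellable output task; here's a rough asyncio sketch of the idea, with every name illustrative:

```python
# Barge-in sketch: the reply streams out as a cancellable task, and newly
# detected user speech interrupts it, re-prompting with the fresh context.
import asyncio

async def stream_reply(text: str):
    for word in text.split():
        print(word, end=" ", flush=True)        # stand-in for audio-out chunks
        await asyncio.sleep(0.2)

async def converse(user_turns):
    speaking = None
    for turn in user_turns:                     # stand-in for a mic/VAD stream
        if speaking and not speaking.done():
            speaking.cancel()                   # interrupt mid-sentence
            print("\n[interrupted]")
        speaking = asyncio.create_task(
            stream_reply(f"Replying to '{turn}' with updated context ..."))
        await asyncio.sleep(1.0)                # user speaks again mid-reply
    if speaking:
        await speaking

asyncio.run(converse(["hello there", "wait, actually"]))
```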
The emotional dynamics and inflections of GPT-4o remind me of how you would program a sample library with dynamic layers and the ability to crossfade between them, but perhaps something else is going on here?
So could the health benefits of green spaces be partially due to proximal situational sampling? If there were post-surgery cases where a view of green space made the outcomes statistically better, I'm wondering if this is part of the reason.
Back in 2000 microphones and audio recording worked great. Too much tech today.
Yeah, the only limiting thing is the storing of new real-time information, and overall memory capacity and access.
Always enjoy hearing your perspectives. While listening I thought it’s interesting how people are aware of only a small amount of the data being received by our senses. Will these multimodal models be able to capture and process the full sensory experience in ways we can’t or will we need to devise ways to help them filter the vast amount of info input like a human does?
Useful commentary, good job!
Hey, have you checked out the recent advancements in the Tsetlin machine?
I think there's also an important aspect of how efficient the training is; otherwise, training bigger and bigger models would be extremely expensive.
Thank you David I have been highlighting this for over a year 😀
If you look at intelligence as a complex system, there’s a certain tipping point of complexity where the system takes on new characteristics. That’s why you can’t really point to something like an LLM and say it’s never going to work.
Dave, do you think imagination as a human quality will increase or decrease in value (socially, commercially, etc.) with the development of AI? Or will it be first one and then the other?
At the moment, it seems AI still entirely lacks imagination of the kind that produces true poetry, emotionally moving art, and other creative works that feel familiar and profound. It seems to me that there is something very mysterious and unpredictable about that kind of imagination. There is both chaos and harmony in it. I feel this artistic imagination is uniquely affecting because it creates meaning from sometimes totally dissimilar unions of ideas, and this meaning is experienced as something greater than the sum of its parts.
Poetry is an interesting linguistic example. Good poetry seems to use words to express something beyond the words- it needs the human being to interpret it through the lens of its humanness in order for it to be poetry, otherwise, the words only mean exactly what they say and nothing more.
At the same time, it does feel intuitively correct, at least from a materialist point of view, that with enough data, the entire continuum of human subtlety could ultimately be achievable. But even that I see as only being possible after the entire architecture of the brain right down to the atom is “solved” by AI, mainly due to the fact that we ourselves have experiences too ephemeral to be understood (or even noticed), but not too subtle to be felt and assigned meaning.
One more thought: An artwork itself may be the only data that truly represents what it is, because its real expression and meaning depend upon how it affects a human being directly. If the meaning of art is the experience it produces more so than how that experience is interpreted by the intellect, then maybe it is not so simple to synthesize. The real “data” are effectively not present, because the interpreter is not human.