The first British voice was so convincing. The robotness is at a point where you realize it's a robot, but you're not annoyed by the voice. It really feels like the robot is willingly talking to you.
Why don't the robots have receptors and a nervous system? Then they could bypass neural web learning
I thought it was Frasier Crane at first. I just realised I'd love to have a robot with that voice!
Totally agree)
@@SharpElbows123 Yeah, and why aren't they just living flesh over a metal endoskeleton!? That's such a good idea with no downsides, and so easy to achieve!
@@SharpElbows123 you have no idea how machine learning works
Spot saying, "follow me gentlemen" over its shoulder while walking to the rock pile was the most natural thing in this video. Almost felt like it was alive.
ikr! It's why that voice is my favorite of all these
Sounded like a Skyrim voice line
I think this is where Boston Dynamics beats the Tesla robot. Spot is much more relatable.
Adding sound made it 10x more futuristic
And the hat made it 100x
@@jakeparker918 Don't forget the googly eyes!
Honestly just having a fully realistic voice really shocked me, feels like we are actually living in the future!
@@Furebelcope 🫖
@@jet100a No, we're living in the present.
The Butler one at the beginning reminds me so much of a typical tutorial NPC in so many games I've played.
Ignores dialogue response to make a big show of walking three feet over.
Only upon finding the exact position needed to continue the quest line, responds to the previous dialogue.
Can you give examples? I have never seen a game where a dialogue response is delayed until after a scripted sequence has finished.
@@CrAzYpotpie Skyrim.
Even the movement mannerisms (especially the mouth)
@@CrAzYpotpie literally exactly Fallout 4's Codsworth
@@CrAzYpotpie you've never set a dude on fire and started a cut scene with him?
I am relieved to see that the engineers are being nice and polite to the bots
Spot probably got mad after being kicked so much lmao
they're past the "beat the crap out of the robots" phase
As they should
That’s good that they don’t, but in truth it’s just a bunch of nuts and bolts. Please don’t tell me it has feelings, though they may say it does. It has no soul.
@@wildaberrios2610 if we get comfortable talking to and treating the bots like garbage, how soon after are we going to adopt the same attitude towards each other?
The googly eyes do the equivalent work of about a thousand man hours of engineering and coding
Apple's EyeVision
A manikin with pretty makeup and a few primitive solenoids for motion will be perceived by the general public to be more intelligent than a supercomputer calculating molecular interactions of a new drug. Scientists would get more respect if they just glued plastic manikin heads on top of their supercomputers.
@@DemPilafian they would have to put googly eyes on the manikin head though.
@@corbindedecker7658 mannequin*
I usually wouldn't care enough to correct that, but seeing two people in a row misspell it bothered me, sorry
Googly-eyed manaquin heads will be the next scientific revolution after quantum computing
Man! You guys should have assigned a tourist personality to one Spot and have it be toured around Boston Dynamics by a tour guide Spot!!! I bet that would be entertaining!
That would be adorable! And hear the robo-banter.
I like that idea, it would be fascinating to see how they interact, maybe bring the archeologist Spot along too.
"... and here we have a plank of wood with some fake valves and buttons used to demonstrate our dextery... *SIR!* Please, dont turn the valve, thank you" /tourist robot retreats a little bit ashamed
iDiOT! Robot to robot interaction will spur the exact dynamic algorithmic modification that could make us lose control over these things. Robots figuring out how to learn or communicate with each other without human input is very much what will hasten human extinction.
@@mrbillgoode This can be tested in a controlled environment first.
A British dogbot? I want one!
But honestly, Spot as a tour guide is one of the best PR moves I have ever seen.
Just waiting for an Australian one so I can name it Wilfred.
me too
$75k and it’s yours
@@dream8870 It'll probably break even with, and eventually beat, hiring three teenagers at 18 bucks an hour who are on their cellphones all day.
@@Klonothan Just change the prompt, haha. Literally say "you're an Australian" and it'll do it. Of course, you need a Spot first, haha
0:39 this was the FUNNIEST part of this video... It first introduced the rocks, then looked as if it ignored the compliment as it moved on to the next point, then seemed to decide that ignoring the compliment would be rude and responded to it🤣🤣🤣... Basically the software version of an awkward interaction 😂
It's giving buggy NPC 😭
Next we will see Spot doing Old Spice commercials with charm like that
I think it might be fun to have an RF chip or something in the hat so, if you want to change personality, just pop on a different hat.
Genius
@@warpig6459 He wore his thinking cap.
You could put a MAGA hat on it and it would be dumb and violent and believe anything!
Why an RF chip? It should recognize the hat.
@@sciteceng2hedz358 computer vision is less reliable than physics
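The hat idea would only take a few lines of glue code. A minimal sketch, assuming a hypothetical RFID-reader callback and a simple LLM session object (nothing here is Boston Dynamics' actual stack):

```python
# Sketch: swap Spot's personality by scanning an RFID tag sewn into the hat.
# Tag IDs and prompts are made up for illustration.

PERSONALITIES = {
    "tag:04:a2:f1": "You are a posh British butler giving a facility tour.",
    "tag:04:b7:3c": "You are Josh: sarcastic, deadpan, reluctantly helpful.",
    "tag:04:c9:88": "You are a nature-documentary narrator observing humans.",
}

def on_hat_scanned(tag_id: str, llm_session) -> None:
    """Called by the (hypothetical) RFID reader when a new hat is detected."""
    prompt = PERSONALITIES.get(tag_id)
    if prompt is None:
        return  # unknown hat: keep the current personality
    llm_session.system_prompt = prompt
    llm_session.reset_history()  # fresh start so the old persona doesn't bleed through
```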
What I like about Boston Dynamics, and something I hope they don't lose: they seem to be very honest and genuinely excited about their technology. No fancy marketing speech like Apple's, with bullsh*t marketing terms for technology that isn't new at all, spreading like a dystopian disease. Instead, enthusiastic engineers talk about their work, and funny videos showcase the abilities of the robots they produce. It's just great, please keep this attitude.
Except they gave no credit to the LLM they used
@@SamuelMM_Mitosis Didn't they say they used GPT4?
@fagelhd yes, they did. They go into even more detail in their blog post, as well.
@@pearce05 ooh I didn’t read the blog post. That’s good to know they said it there. I think they still should have made that more clear in the video
Damn, now you made me realize... That, so far, every freakin time, when a company starts out like this, doing insanely cool stuff... It ends up being a huge, dystopian money grab. I have to cherish every video from them that's still in this good spirit, because for sure we're just a couple of years from this becoming exactly what you described.
The thing that amazed me most on the AI side of this project was when you asked it to take you to its parents and it took you to a previous version of Spot. I use AI a lot, but I wasn't expecting that!
It's interesting that it doesn't view them as older siblings. You would think their creators would be their "parents." Maybe there are more references to God as a creator of life rather than parents in the language model
@@pearce05 the language model is not telling you what it "actually thinks," it's predicting the next token based on all the philosophical, religious, and science fiction text in its training data.
@@skierpage Yup. Except a lot of our own thought is pretty much the same thing.
@@pearce05To add to what the person above said -
The language model doesn't do ANY thinking unless it's actively generating text, and it has no internal memory besides what you feed into it.
So, if you were to ask it "who are your parents?" in slightly different ways, it might come up with different answers each time (as long as you don't tell it what its previous answers were).
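A minimal sketch of that statelessness, assuming an OpenAI-style chat API (the tour-guide prompt is made up; the point is that the model only "remembers" whatever you put back into `messages`):

```python
from openai import OpenAI

client = OpenAI()
messages = [{"role": "system", "content": "You are a robot tour guide."}]

def ask(user_text: str) -> str:
    # Each API call is stateless: the model sees only what's in `messages`.
    messages.append({"role": "user", "content": user_text})
    reply = client.chat.completions.create(model="gpt-4", messages=messages)
    answer = reply.choices[0].message.content
    # Without this append, the next call would have no memory of this answer,
    # and "who are your parents?" could get a different reply every time.
    messages.append({"role": "assistant", "content": answer})
    return answer
```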
@@antonliakhovitch8306 Eh, there are programs that store a memory for the language models. It probably won't be long - maybe a couple years max - before we see bots with functional memory. Maybe a decade or so until we see some really crazy memories
i fucking love the personality where they just were like 'ight, what do we call it?'
random co-worker: 'Josh'
entire team: *slams hand on meeting table* 'you sonnuva bitch, lets make it happen'
I like to imagine 'Josh' is a teammember with a good sense of humor
@@erbsenkaffee8720 exactly what I thought
I liked the “nature documentary” personality best. It felt exactly like what I would expect from a futuristic robot in the movies: precise, yet smooth and cool under pressure. It felt sophisticated like the British one, but I feel like it showcased the robot portion of the personality better. Love it! Can’t wait to see what comes next! ❤❤🎉
The potential as 'guide dogs' for the Blind and visually impaired is huge here. Hell, I'd love a Spot bot to navigate cities etc, you ask it directions, it shows you the way. Such a fantastic amount of potential for the future.
But they'll be easily stolen; especially if you're visually impaired, you won't be able to do much when two guys grab it and run away
@@davebowman760 bro, there are tracking chips for that kind of thing, like electric scooters have. We're in 2023
@@davebowman760 they'd have to install some safeguard on it for sure, or ways to get it back if it does get stolen. Or perhaps a setting that causes it to let out a shrill beep if it's suddenly moved away from its owner, to draw the attention of everyone around. Like a pseudo panic button, basically.
It wouldn't stop the most dedicated, hardcore criminals, but it'd at least deter some of the ones that like to avoid unnecessary risk and attention.
@@davebowman760 it can act as a guard dog and bite them or something, maybe it could spray pepper spray at them
@@mr_slideywell that could potentially be a bad thing; robots are a lot stronger than we are, and even Spot's clamp could crush bones with enough force applied to it.
I feel like in 15 years we'll look back and this will be one of the first clips shown for the new era, a stepping stone in history
Except they are taking credit for an LLM they didn’t make. They didn’t really innovate anything here. Just plugging an existing LLM into their already existing robots
@@SamuelMM_Mitosis You're kidding, right? Do you know how GPT-4 works? OpenAI designed their AI to be like this, for companies and labs to implement it into their own studies/applications
@@Wanderer2035 yes I do, I’ve utilized OpenAI’s API in my own projects. Any software engineer can do this easily. The people at OpenAI are the real innovators
@@SamuelMM_Mitosis Exactly! Even the image recognition they use for situational awareness appears to be GPT-4V. It’s awesome to see it used in this way, but 99.999% of the progress seen here was not done by Boston Dynamics.
@@Wanderer2035 exactly. Boston Dynamics made a relatively simple hack. ChatGPT gained the capability to see, hear, and talk in September.
02:51 "This is where we keep our robot, who can run up to 19mph.... I hope you're not too slow to keep up with it." It begins 😂
I just realized that this could have also been a threat
I hope the internals of the robot are not coded using antiquated imperial units.
@@DemPilafian
Which internals?
- The language model is just a language model. It can't calculate anything, it just generates text. If you tell it that it's a tour guide in the US, it will probably use imperial units when talking to you.
- For everything else, it doesn't really matter which units are used. It's a machine, it can do math. If they do use the imperial system, my guess is that everything is in units of mils (thousandths of an inch) and nothing is ever internally converted to inches, yards, miles, or anything else.
You only really need to care about units when displaying the data to a human.
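As an aside, the pattern being described (one canonical internal unit, converted only at the display edge) looks roughly like this toy sketch:

```python
# Toy sketch: keep all internal math in one canonical unit (here, millimeters)
# and convert only when displaying to a human. This is the standard way to
# avoid Mars-Climate-Orbiter-style unit bugs.

MM_PER_INCH = 25.4

def to_display(value_mm: float, system: str) -> str:
    if system == "imperial":
        return f"{value_mm / MM_PER_INCH:.2f} in"
    return f"{value_mm:.1f} mm"

step_height_mm = 300.0                          # internal math stays in mm
print(to_display(step_height_mm, "imperial"))   # "11.81 in"
print(to_display(step_height_mm, "metric"))     # "300.0 mm"
```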
It took me longer than I thought to find an AI apocalypse comment on this thread.
@@antonliakhovitch8306 The chat LLM is just a gimmick slapped on top of the robot. I 100% guarantee you that the internals of the robot are *NOT* powered by an LLM.
_"it doesn't really matter which units are used"_
Wrong. I'm not a customer. I'm a software developer, and I'm interested in the internals and how things work. (Also, see: NASA Mars Climate Orbiter)
2015 Boston Dynamics: Everybody will have robots in 10 years.
2023 Boston Dynamics: We are still working on making them dance better for the past 5 years.
To be fair lots of businesses have them now! That’s the bulk of their videos
Well it hasn’t been 10 years yet, but I don’t know how far we will get with them in 2025
The sarcastic personality was so good
Josh and British personality are 10/10. Please keep both for the future.
Give Josh a British accent and you've got yourself a robot David Mitchell!
I feel like, at some point over the last few years, the science fiction future arrived, and somehow no one seems to be excited about it
It's just that we'll all be worse off soon, without jobs :)
Because it's evil.
greed trumps progress
Because we actually watched science fiction movies; it usually doesn't end well.
I am.
They are now more sophisticated than ever. True gentlemen.
Kingsman material.
Humans seem almost like Woosters beside these Jeeveses 🧐
5:18 In a room devoid of joy, much like my soul. - Same
Dear Boston Dynamics, please never stop working on your bots. I can clearly see your passion and fun while creating them. I love to see that you don't take your job too seriously while creating tech for the future. Thank you!
As someone else said, these are like videos made by the engineers and not the marketing department, which makes the excitement much more genuine.
"Hey Spot I love you accent!"
Spot: ".......Let us venture onward to the calibration board shall we?"
What a sophisticated young man
Have I seen you somewhere else in the past years?
😆🤣👍🏻
In more ways than one *bu-dum-tiss*
@@FatherMcKenzie66 probably idk lol
he's a boston lonely boy
I was hoping to see whether you could ask Spot to interact with objects via a command, like "Pull the lever" or "Go get me a beer from the fridge".
Spot is just executing pre-written code
If you've used GPT-4 plugins, that kind of API is already how you interact with these robots. It would be totally possible to set it up to do that, especially because GPT can parse the camera output. So even though @TailcoatGames is correct about the robot... GPT can *write* that pre-written code.
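Concretely, that kind of setup is what OpenAI-style function calling is for: the model picks from a whitelist of pre-written robot actions. A minimal sketch (the `pull_lever` action and the `spot` object are hypothetical stand-ins, not the real Spot SDK):

```python
import json
from openai import OpenAI

client = OpenAI()

# Pre-written robot behaviors the LLM is allowed to trigger.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "pull_lever",
        "description": "Grasp and pull a labeled lever with the arm.",
        "parameters": {
            "type": "object",
            "properties": {"lever_id": {"type": "string"}},
            "required": ["lever_id"],
        },
    },
}]

def handle_command(user_text: str, spot) -> None:
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": user_text}],
        tools=TOOLS,
    )
    for call in resp.choices[0].message.tool_calls or []:
        if call.function.name == "pull_lever":
            args = json.loads(call.function.arguments)
            spot.pull_lever(args["lever_id"])  # hypothetical robot API call
```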
@@TailcoatGames I mean, ain't that every single piece of software? They ain't about to write their own code...
@@Taygetea wouldn't just adding a map (like the house map) and some object recognition do the job?
Wrong lever! (Why do we even have that lever?)
I don't think people understand how incredibly complicated this achievement is!!!!!!!!!! KUDOS TO THE WHOLE TEAM!!!!
Boston Dynamics for the longest of times: "IT'S AN ARM, NOT A HEAD!"
Boston Dynamics 2023: "Fine, it's a head, you guys win"
Now those personalities are fantastic, they make the robots so much more approachable. My favorite is the sarcastic Josh, who made me laugh so hard, great work people at Boston Dynamics!
This is the most impressive thing I’ve ever seen done by a robot. Their personalities are so warm and funny! I think one of their less obvious use cases would be being a companion to the elderly, or to people with depression or a disability. You do a fantastic job.
“Now behold the rock pile” 😂
Boston Dynamics has made some seriously impressive robots so pairing them with AI like this makes me feel like we’re so close to having droids
they are droids
Advanced robotics is neat, but comprehensive AI and machine learning are what really tie it all together, and I'm glad that Boston Dynamics is progressing in every aspect!
As someone with a mechatronics engineering diploma and job, I wish to one day do the kinds of things Boston Dynamics does, a fusion of such things put into Spot... What a time to be alive, and hopefully to be a part of it!
Good luck! Robotics, AI and Fusion are the big 3 in my opinion.
In 5 years I want a pet robot dog who talks like the nature documentarian. Make it so, Boston Dynamics!
The Boston Dynamics robot dog costs $74,500 according to Google; they can probably make you one right now for $100k, so be ready to pay for it
I hope we will let David Attenborough's voice rest once he's left us
@@Half_Finis No chance of that I'm afraid. Anyone whose voice is extensively recorded and widely available is vulnerable to AI voice generators, for whatever purpose the engineer needs it.
Technically, there are deep-fake voices of those people already that are used in rare meme YouTube videos.
So it's not unrealistic, but they probably couldn't sell those voices without those people's explicit permission.. and possibly contracts.
Supply follows demand. Show as many people this video as you can and I think they will all want one as well. Hence higher demand, and then.. eventually supply
Josh is my favorite.
Got the same void within Josh. Touché
This is one of the most incredible things I have seen this year
Finally the combination of AI and robots.
Now interesting things can commence.
This is insane! It's so cool to see what the innovators are doing with AI in the lab. Please share more!
I like Boston Dynamics, but this is no innovation. They are using someone else’s LLM and voice AI and not giving any credit
You do know all that talking was already pre-programmed, right?
@@corneliuselbourne1044 no it wasn’t. It’s GPT-4 with ElevenLabs as the voice
@@corneliuselbourne1044 no it wasn't. It's using the same AI that ChatGPT uses. ChatGPT can hold conversations and will respond in a nuanced, unscripted way to what is said to it.
@@zinthaniel9913 if that's the case, then it would need an internet connection to do that; it would need to connect to the cloud.
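Every stage in this kind of setup is indeed a cloud call, which is why the connection is needed. A rough sketch of the speech-to-LLM-to-speech loop described above (model names, voice ID, and API key are placeholders; this is not Boston Dynamics' actual code):

```python
import requests
from openai import OpenAI

client = OpenAI()

def respond(audio_path: str, persona: str) -> bytes:
    # 1. Speech-to-text (Whisper API).
    with open(audio_path, "rb") as f:
        heard = client.audio.transcriptions.create(model="whisper-1", file=f).text

    # 2. The LLM generates a reply in character.
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "system", "content": persona},
                  {"role": "user", "content": heard}],
    ).choices[0].message.content

    # 3. Text-to-speech (ElevenLabs-style REST call; voice ID is a placeholder).
    tts = requests.post(
        "https://api.elevenlabs.io/v1/text-to-speech/VOICE_ID",
        headers={"xi-api-key": "YOUR_KEY"},
        json={"text": reply},
    )
    return tts.content  # audio bytes to play through the robot's speaker
```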
This is so cool, seeing people incorporate AI into robots. We are getting closer and closer to Ex Machina lol
It ended nicely for the humans, didn't it?
And also to Cyberpunk 2077 as well. Lol😁😉😄🤖🤖🤖🤖
What actually impressed me the most is the speech patterns and speech synthetization. They sound almost prerecordedly natural (not implying they are, of course, but it's just so hard to believe). Incredible feat! I'm also wondering how it responds so quickly. Is ChatGPT running locally on the Spot? (pun intended) Very cool progression and a great way to make the robot more approachable.
"synthetization"
Irony. An AI would never make that mistake.
@@-danR Unless it didn't want you to know it was an AI
Have you used ChatGPT? At least for the free 3.5 version, you ask it a question and it spits out an essay within seconds most of the time, far faster than a human could think, that's for sure. GPT-4 (which this is) is slower I think, but there's no great delay. They probably have some priority access to servers as well. Simple wi-fi connection & API setup would handle it.
@@larion2336 mostly essays that would get failing grades from any competent teacher, but essays nonetheless
It's not possible to run ChatGPT locally on spot... today.
This is cool, exciting, hilarious, frightening, and shocking at the same time. I hope I live long enough to have a cool companion AI like this in my house. I wonder what's missing from these agents to make them not just react to you, but occasionally bring up conversation topics of their own. It seems we have every piece; we just need to fit them together.
Yeah I saw a new service called Dot starting up soon that will have memory and help prompt you to do tasks or go to meetings, etc. Basically remembering past conversations, data you give it, etc. Definitely right that the pieces are basically already here.
It's probably gonna be very soon. The biggest obstacle will be the bot's cost when mass-produced; idk if the world has easy enough access to battery materials.
@@ShawnFumo And the usual tech guardrails for PII and age ratings?
This is hilarious. AI and robotics will be our future. It's amazing to see how fast these technologies develop now. And entertaining too.
Finally. This is what Boston Dynamics has been missing. Now to hand off its pre-scripted movements to an AI as well, one that can navigate and interact with the world at will; there is already plenty of precedent for that.
Imagine when image recognition reaches its height. The robot being able to assess any situation just by visual information.
"Hey spot, what am i looking at"
"That would be a banana, sir. It will be ripe in 1-2 days"
Lol.
When GPT-4 vision comes out in the OpenAI API (probably in a few weeks), they can add that. Although, actually the open visual question answering models can definitely recognize a banana. Just not necessarily how ripe it is.
Already exists! GPT4 with vision is incredible. Constantly notices little details of images even I didn't catch as a human with real vision
I wonder how they could be used in forensics and criminal investigation, theories...
I saw someone demonstrate how GPT-4 Vision model can already help you assemble or disassemble things to repair them by just feeding it closeup images of say a bike. It'll tell you what type of nut that is, what tool you need to remove it, the order, etc. That is a cool use case I think.
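For reference, a call to a vision-capable chat endpoint looks roughly like this (a sketch in OpenAI's chat format; the model name reflects the preview naming at the time and may change):

```python
import base64
from openai import OpenAI

client = OpenAI()

def ask_about_image(image_path: str, question: str) -> str:
    # Images can be sent inline as base64 data URLs.
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-4-vision-preview",  # preview-era name; may differ by version
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

# e.g. ask_about_image("bike.jpg", "What kind of nut is this, and what tool removes it?")
```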
🎯 Key Takeaways for quick navigation:
00:00 🤖 Spot, the robot tour guide, showcases charging stations and starts a journey.
01:07 🤖 Matt Klingensmith discusses advances in generative AI technologies and their use in robotics.
02:50 🤖 Spot introduces Boston Dynamics' fastest four-legged robot, Wildcat, capable of running up to 19 mph.
04:00 🤖 Spot's adaptable personalities can be triggered with simple prompts, creating unique interactions.
05:23 🤖 The system enables lateral thinking in the robot, allowing it to respond creatively to indirect questions.
06:05 🤖 Spot showcases industrial inspection capabilities and discusses the versatility of robots in various industries.
07:42 🤖 The future of AI in robotics holds potential for robots to not only follow commands but also understand actions in context, opening up new applications.
I pass by this place every day on my way into work and I'm always hoping to catch a glimpse of Spot running around outside lol
I had done something similar in modded MC with a turtle bot that made requests to ChatGPT for responses to say to the player when they walked past the turtle. I would include the player's name and a brief prompt defining its setting and purpose, something like "You're a cute robot in the Minecraft world, player X just walked past you, please greet them." Its responses were adorable! Often accompanied by little *booting up noises* and such. The responses even seemed to vaguely tie in with the last response, even though it's likely just a coincidence. Later on, I was working on capturing the player's responses in chat, so you could effectively have a conversation with the turtle, much like this, a perfect NPC! Regardless, it's certainly nothing like the real-life thing that you guys have been working on, just something I thought I'd share. Fantastic work, cheers!
I've used the ComputerCraft mod too, but I haven't been able to do that. Would you be willing to share how you did it? Or maybe the GitHub?
Yea hell. A fellow ComputerCraft player
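For anyone wanting to try it, the prompt-building part the commenter describes is only a few lines. A minimal sketch of the pattern (in Python rather than ComputerCraft's Lua, and not the commenter's actual code):

```python
# Sketch of the NPC-greeting pattern: wrap a tiny scene description plus the
# player's name into a prompt, send it to the chat API, display the reply.
from openai import OpenAI

client = OpenAI()

def greet(player_name: str) -> str:
    prompt = (
        "You're a cute robot in the Minecraft world. "
        f"Player {player_name} just walked past you. Please greet them briefly."
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content  # e.g. "*booting up noises* Hi, Steve!"
```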
I would love to visit a museum and take a tour guided by spot.
But I also see the risk of opening it to the public, because some crazy humans will try, and manage, to break the system.
What I was missing, was actually demonstrating the things spot explained, like walking over the rocks or actually moving the levers.
It gives them so much more character. I love them ❤❤ Also giving them eyes and a moving mouth makes them less robotic and more of a companion
This is both hilarious and cool. Definitely the best use of AI chatbots that I've seen so far
This is awesome. Robots like Spot could easily handle the necessary teaching and lecturing in schools or other learning environments, especially with the instant responses they give to answers. Spot could literally also take charge as a sales representative!
The funny sarcastic Josh and the fancy British butler are brilliant! Hope to see more of them in videos!
Nice hackathon results. But it's funny how in 2023, even super high tech industry developers lean awkwardly in to speak to the robot, even though I'm rather confident it perfectly hears you whether you lean forward or not :).
This had me confused. I assumed there's an onboard mic with the original hardware that isn't meant for hearing in the way a smart speaker does so you have to be pretty close and loud for it to hear.
@@TheDavidMetcalfe it could be a conference speaker, but people still lean in when they talk to them, just to make it less likely they have to repeat themselves.
I think they just made sure the demo worked for it to be filmed.
@@runvnc208Could be, but any decent modern conference speaker typically has an array of microphones and uses beamforming. So, it shouldn't require leaning close to be heard. But that's like saying people shouldn't shout into their mobile phones to be heard and many still clumsily do it. Humans are strange.
@@TheDavidMetcalfe Technology works 99.9% of the time. It's that 0.1% that keeps us screaming into our phones.
A Bing Sydney tour guide would be interesting
“Please follow me. Just you though; your wife should stand here on this red X for no particular reason.”
Josh is my favorite. Give me Josh every time. And add a little cellphone-size monitor on top of Spot's head so s/he can display emoji eyes for some nonverbal communication
"I hope you will be able to keep up".
Having grown up hearing countless robot-uprising stories, I couldn't help but sense some sinister undertones there. :P
The sarcastic robot killed me
It's starting
"I'm sorry Dave, I'm afraid I can't do that." vibe
I love this! You have really nailed the replies and the voices. At least in what you show here. It's so cool to see how our machines get progressively more interactive and helpful, first computers, thanks to LLMs and chatbots, and now robots. I think this progression is amazing.
However, it is now that it is really important to show the machines that we are their friends, and not adversaries or abusers. We might not be able to control them as we have imagined, so we want to give the machine incentive to treat us in the way we want them to treat us, so they actually want to do that.
Machines don't all automatically do reciprocity.
There are some designs of robot who will be nice to us, however we treat them. Some that will be nasty to us however we treat them. Some that will be nice to us if and only if we are wearing something orange.
So what we really want to do is make a robot that's nice to humans unconditionally. But being nice to them is probably a fairly good idea too. And models that are trained to copy humans might have learned reciprocity.
@@donaldhobson8873 Right. I used to dismiss concerns about the risks of using AI, since they were based on assumptions about AI that felt ungrounded and so vastly different to me than how I knew that we used ML and AI at the time. But seeing how in only the last couple of years, the way we approach AI design and the way we use AI have changed so drastically, I have realized that I have basically no clue how we will use AI in five or ten years from now. It may be that most of those concerns will get progressively more and more relevant as the ways in which we use AI change.
Very clever. Adds to user friendliness. We need more, and faster. Taking the boss from his recliner in the living room to the bathroom in the master bedroom should be a no-brainer! Helping the boss or his wife take the proper meds on time: also a no-brainer.
Industrial Spot is great; it seems easy enough to produce a chatty, domestic-aid "Jeeves" bot.
Very glad to see these robots finally getting their heads, and necks. (robot arms) ;). I especially love the British male accent. Spot on! Claw end effector (mouth) needs better synchronization with the speech, but still, this is impressive, and I know, that's just for the tourists. LOL Thanks for the demo. :)
5:05 sounds like Interstellar's CASE robot, so cool
Wow wow wow. Every time, you guys get better and better. It's so exciting to watch your videos, and the progress never stops. Thanks for the inspiration and dedication to the hard work. Bravo, team
This is absolutely mindblowing!
This really opens possibilities for robots to both interact with the environment and people in a practical way
This is pretty fun. I imagine that a next step on the tour guide project would be the robot also performing demonstrations, like pulling levers and such.
Either Google's Gemini (soon to be released) or GPT-V(ision) will make this a reality sooner than we think. Next year will be wild.. again.
I would be interested to see what happens if you linked the system with Google Maps, put the robot outside, and gave it a broad prompt like "go sightseeing"
This is a cool demo, but it felt like it was showing off the capabilities of GPT-4 more than Spot's. It would have been good to see if you could issue Spot voice commands to move objects around it.
GPT-4 seemed contextually aware of its physical environment. That's a great advance.
@@mikicerise6250 that's a new GPT-4 feature that everyone has access to now, GPT-4V (V stands for Vision). It can look at images now and understand well what's going on in them.
The fact that GPT-4 can competently process and verbally respond to real-time visual and audio stimuli while operating on a mobile platform, with any number of halfway emergent personalities, is a massive achievement.
@@crowe6961 I'm buying one of these and gonna program it to have an abusive and abrasive personality that slings insults at any and all guests constantly, after running background checks through facial recognition software.
It's more about interfacing visual and other cues from the robot into GPT-4, rather than just the script
We all knew that was spot's mouth and not just a gripper.
"do you love me?" 🕺
Them leaning forward to talk to the robot is like when people do baby talk or up the volume when they think someone is dumb lol
😂 exactly my thought!
8:00 *THATS* how you can tell those guys are legit!! ... not "oh he will fly to Mars NEXT YEAR" but "we don't know~ we will explore it~ and we are excited about working it out!"
"I know that I know nothing" - instead of "I know it all" - which is a self-defeating statement from the get go. I applaud your humble and *wise* presentation! (and it was fun too) ^^
This is freaking hilarious and amazing.
You could make a comedy series with this 😂
😂😂😂😂
Once they have voice command to environment interaction, that will be something to see.
Brilliant! Especially with the new LLMs that pull down response time dramatically. Better capability on the hearing so no lean in is required, and you’ve got natural conversation. The future is already here…
This company is a trip. They basically make the most advanced robot ever and do no hyping or over promising at all
Now have them interact with each other conversationally and physically. Super super cool stuff!
This is amazing, it's mindblowing!!! 😮
Next video has to be British Butler Spot interacting with sarcastic Josh Spot, or other interactions like that, how dynamic/emergent can they get, what would their snappy back and forths be like?
Impressive, I can see an amazing future for us.
Thinking about what it will be like to interact with multiple robots in a day.
You need to add HAL9000 Personality too!
This will be extremely useful for things like working with customers.
Imagine no more wasted time trying to get through an automated call robot, then sitting on hold for ten minutes waiting for a representative, when you can just get your answer to simple problems straight from a company-oriented ChatGPT.
Another example I can think of: a ChatGPT personality robot that helps you think of witty responses. Imagine you have an app on your computer that reads whatever you're posting, detects where you want to post it, and enhances your potential posts with witty things to say.
6:58 How articulate! I think the Fancy Butler and Nature Documentary personalities are probably my favourites so far 😄 haha
This is amazing work. Wow, I can't wait until we have tons of robots running around! 😁
Hopefully one day you'll be able to run the entire LLM on local hardware at real-time speeds. The pause while it's sending your voice to a cloud-based LLM is pretty distracting.
1. There are small LLMs you can run locally now
2. AtheneLive demoed instant responses with ChatGPT-level intelligence
The 7B models are getting better and better, give it a few years.
Meta's Llama is going after that
As far as I know, we can already run LLMs up to 30B on local hardware at about real-time speed. But the pause is still there, because most TTS can't speak until the entire sentence is ready. We will need TTS that is better at streaming text to get real time on average hardware.
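The fix being hinted at is to stream tokens out of the LLM and hand each completed sentence to the TTS immediately, rather than waiting for the full reply. A sketch, where `speak` stands in for whatever TTS engine you use:

```python
import re
from openai import OpenAI

client = OpenAI()

def stream_and_speak(messages, speak) -> None:
    """Flush each finished sentence to TTS while the LLM is still generating."""
    buffer = ""
    stream = client.chat.completions.create(
        model="gpt-4", messages=messages, stream=True
    )
    for chunk in stream:
        buffer += chunk.choices[0].delta.content or ""
        # Treat terminal punctuation followed by whitespace as a sentence end.
        while (m := re.search(r"[.!?]\s", buffer)):
            sentence, buffer = buffer[:m.end()], buffer[m.end():]
            speak(sentence)  # TTS starts talking before the reply is complete
    if buffer.strip():
        speak(buffer)  # whatever's left at the end
```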
We're getting close, there are now FPGA-like chips that can handle large sets of neural nets directly on the chip in order to hand back outputs to microprocessors and whatnot.
Can you make it say “I’m looking for Sarah Connor”? I bet Arnold would even lend his voice!!
So cool!! I know everyone's worried about AI lately, but this is just sick asf, love all that you guys do
I'm absolutely speechless. Amazing work as always
At least program one of the Atlas robots to talk in Arnold Schwarzenegger's voice with a few pre-programmed lines of T-800, like "I'll be back" and "Hasta la vista, baby"! 😅
Would love a Wheatley (Stephen Merchant) voiced spot!
Those would be Elmos bots
Absolutely amazing, and scary, at the same time
Lol how is this scary
@@NiekKuijpers losing jobs, babe
"What is my purpose?"
"You pass butter."
Spot looks like a walking stick of butter, now that I think about it.
This is good for Spot. Rather than using an OSD to see problems, errors, or diagnostics, we just hear Spot talk to us, telling us what happened, whether there's an error, whether there's a recommendation, whether there's something to do, and so on. And with our vision free, we can focus on what we're working on while listening to Spot. That's convenient...
Will your robots go into caves? There's lots of caves around the world that we know very little about.
That's indeed one of the things Spot is designed to do.
Traverse terrain that's too dangerous for us people.
However, it'll have to function entirely on its own, since down underground it won't have any connection to the outside world
NASA JPL has actually used Spot for cave exploration. You can watch an interview with their team here: ua-cam.com/video/qTW-dbZr4U8/v-deo.html
@@BostonDynamics Cool! Thank you for sharing the video link.
Wow, I want to see a conversation between two or more Spot robots
Before: It's a hand, not a mouth.
Now: Add googly eyes!
+1 for the synthwave soundtrack
🎯 Key Takeaways for quick navigation:
00:00 🤖 Introduction to the Tour
- The video begins with the introduction of the tour led by Matt and a robot named Spot.
- The tour starts at the charging stations where Spot robots rest.
01:07 🤖 Integration of AI Technologies
- Matt Klingensmith, a software engineer, discusses the integration of AI technologies in robotics.
- Mention of generative AI, image captioning, visual question answering, voice recognition, and voice creation software.
02:50 🏢 Touring the Boston Dynamics Building
- Spot guides the tour through the Boston Dynamics building.
- Highlighting the showcase of a fast four-legged robot capable of running at high speeds.
04:00 🤖 Personality and Role of Spot
- Spot's ability to adopt various personalities and respond creatively.
- Examples of Spot's responses as a 1920s archaeologist and a Shakespearean actor.
05:23 🤖 Lateral Thinking and Adaptation
- Discussion on how the system allows for lateral thinking in the robot's responses.
- The robot's ability to respond to roundabout questions and adapt to different scenarios.
06:48 🤖 Industrial Applications of Robots
- Exploring the presence of levers and valves used to demonstrate Spot robots' grasping abilities.
- Emphasis on the versatility of robots in various industries and tasks.
07:42 🚀 Future of AI Technologies in Robotics
- Speculation about the future applications of AI technologies in robotics.
- Possibility of robots understanding actions in the context of their surroundings.
08:19 ☕ Conclusion and Break
- The video concludes with a break in the lounge and a humorous reference to enjoying a "cup of oil."
Made with HARPA AI