38:00 During development you only need about the same number of bots as you have teams. Extra bots that are not in use are wasted resources since their hardware is not finalized. During development you want your fabricators working on upgrades and new designs.
Great news about your mom!! I'm glad she will be with you to see many more incredible things. As for the coming robots... wow! Things are getting really crazy good crazy fast.
As FSD gets closer to final reality, the question I have seen no one go into in any detail yet is how autonomous vehicles will interact with humans, including verbally and visually. Our infrastructure is set up so that 99.9% of the time drivers/vehicles just follow the rules of the road, reacting as needed to signals, signage, and other traffic. However, we all know there are times when emergency responders, construction workers, and other folks in their own cars are communicating with us as drivers. How will FSD handle these situations?
You two have to talk about how the robot acquires context. The robot assumes the German flag is "Germany," not the country itself. How does the robot parse a statement to work out what the human has asked? Japanese is context-dependent rather than using grammatical elements to clue the listener in to the what, where, and when that are meant.
Tesla Vision could tell ChatGPT what it sees and what its intentions are, and ChatGPT could tell the car what to do. This kind of verbalization is actually what I do in complicated traffic situations.
Hi, I'm standing with my car at an intersection of two roads. Both roads are two lanes per direction, i.e., 4 lanes total. Both directions have traffic lights; mine are red. I intend to make a left turn. There are numerous cars on the opposing side of the intersection; they have a red light too. The crossing traffic and the pedestrians have a green light. Write detailed instructions for how to proceed with my plan.
Sure, I can guide you on how to make a left turn at this intersection. Please keep in mind that local traffic laws and customs may vary, so always follow any signs, signals, and laws that are applicable in your area. Here's a general guide on how to proceed:
1. Positioning: You should already be in the farthest left lane. Your car should be positioned slightly to the right within the lane (to avoid any oncoming vehicles), but not so much that it encroaches into the next lane. Your wheels should be straight to prevent a bad outcome if your car gets hit from behind. 😮
2. Signal: Keep your left turn signal on to inform other drivers about your intent to turn left.
3. Wait: While the light is red, you must wait behind the line at the intersection. Keep an eye on the traffic light and also on the oncoming traffic.
4. Green Light: When your light turns green, if it's a green arrow, you can proceed with the left turn, ensuring there are no pedestrians crossing. If it's a general green light (not an arrow), you should move forward to the middle of the intersection, but you must yield to oncoming traffic until it is safe to turn. In the intersection, keep your wheels straight while waiting.
5. Safe Gap: Look for a safe gap in the oncoming traffic. Do not rush. The gap should be large enough for you to comfortably complete your turn without causing oncoming traffic to slow down or swerve.
6. Turning: Once there is a safe gap, check once more for pedestrians, then turn left. Avoid cutting the corner close or making a wide turn that brings you into the path of vehicles in other lanes. You should aim to enter the lane that is just to the right of the median (or the leftmost lane in countries where driving is on the right side of the road).
7. Completion: Once you have completed your turn, if your final desired lane is not the one you turned into, use your signals to change lanes safely, ensuring there's no vehicle in your blind spot.
Please remember, it's crucial to remain alert and cautious throughout the entire process. Safety should always be the top priority when driving.
@Robert...Schrey that's a very good piece of writing; it really does sound like what it would write
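For readers curious what that "car verbalizes, LLM advises" loop might look like as code, here is a minimal Python sketch using the public OpenAI chat API. The describe_scene() helper and the state fields are hypothetical, the model choice is arbitrary, and round-trip latency alone would rule this out for real-time driving; it only shows the shape of the idea.

```python
# Toy sketch of the "car verbalizes, LLM advises" idea from the comments
# above. The perception snapshot and describe_scene() are hypothetical;
# this is NOT Tesla's architecture.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def describe_scene(state: dict) -> str:
    """Turn a (made-up) perception snapshot into plain English."""
    return (
        f"I am in the {state['lane']} lane at a red light, intending to "
        f"{state['intent']}. There are {state['oncoming']} oncoming cars."
    )


def advise(state: dict) -> str:
    """Ask the language model for a high-level driving recommendation."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # arbitrary choice for the sketch
        messages=[
            {"role": "system", "content": "You are a cautious driving advisor."},
            {"role": "user", "content": describe_scene(state)},
        ],
    )
    return response.choices[0].message.content


print(advise({"lane": "leftmost", "intent": "turn left", "oncoming": 6}))
```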
“Maybe Elon's ultimate genius is that he is not subject to the sunk cost fallacy”: that's a very interesting point, John! 🤔
Great video. Keep it up. Thanks for including the Falcon Heavy outtake.
The robot has a "world" of at most a radius of arm's reach. I expect there is an early filter that reduces its context to anything within that radius. That filter would make a tremendous reduction in the variables the robot is dealing with. Germany the country is not within reach, so it is ignored; Google vision recognition would "easily" recognize the flag as German, and that is what the robot is responding to.
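As a toy illustration of that proposed arm's-reach filter (the reach radius, labels, and coordinates below are all invented, and nothing here is based on the actual robot's software):

```python
# Toy "arm's reach" context filter: discard any detection the arm could
# not physically touch, shrinking the problem the robot has to reason over.
import math
from dataclasses import dataclass

ARM_REACH_M = 0.8  # assumed reach radius in metres


@dataclass
class Detection:
    label: str
    x: float  # robot-centric coordinates, metres
    y: float
    z: float


def within_reach(d: Detection, reach: float = ARM_REACH_M) -> bool:
    """Keep only objects the arm could physically touch."""
    return math.sqrt(d.x ** 2 + d.y ** 2 + d.z ** 2) <= reach


detections = [
    Detection("german_flag_card", 0.4, 0.1, 0.0),
    Detection("coffee_mug", 0.5, -0.2, 0.1),
    Detection("doorway", 3.0, 1.0, 0.0),  # beyond reach: dropped
]
context = [d for d in detections if within_reach(d)]
print([d.label for d in context])  # ['german_flag_card', 'coffee_mug']
```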
I've been waiting for a robo-taxi all my life. Elon, make it so.
Hands down my favorite Scott episode!! This is the most I've seen him flex his experience and expertise!
I would think Tesla would go for Vision (LVM?) before an LLM
Should be a fun watch
40:00 You just voiced one of my concerns about upgrading my Model Y to get version 4 hardware. Taking full advantage of FSD changes for even the next 5 years may require version 5, 6, or 7 hardware. Right now even the sensors and their layout are still in flux.
GLAD I WATCHED TO THE END 🤗😎😎
34:20 Safety, safety, safety. That is the real AI challenge.
You two rock thanks
We are seeing Tesla moving to make FSD into a full virtual-world AI model that can interface with everything from our cars to humanoid robots. It's an order of magnitude bigger than most people, even those in the AI field, have grasped yet. This will let all our devices interact with the world. We are truly seeing an AlphaGo moment, as so many areas of AI are reaching a point of development that allows them to become one unified system. Once this is done, we will see it grow and learn so quickly people won't believe it.

I suspect that Elon has been watching and waiting until he saw that the time was right for this to happen. Much of what we are seeing was already there in his mind, just waiting for him to see the signs. You wonder just what Elon is really pondering and how Tesla, Twitter, FSD, Optimus, and Dojo will finally come together. Eventually one system for all of their needs could be developed. It would recognize what system it was controlling and provide just what it needs to function. That would mean one mothership to maintain and one AI system operating on one virtual copy of the world, one that sees through vision and converses and understands through a large language model. These next few years will see disruption on a scale never seen before.
For the rock-as-a-hammer scenario, I would actually include a glass bottle in the mix, to see what kind of variables it would weigh. You could technically use a glass bottle to hammer in a nail; there is just a certain percentage risk of it shattering.
DM's robotic extension is limited to motion on a flat plane. O, probably, has already been fitted with sensors that will allow it to go from its shuffling gait to a steppage gait while being safe enough, despite its weight, that it won't crush an egg, a stair, or a foot.
DeepMind's (DM) robotics is way behind FSD and Optimus (O). DM's hand is straight out of Lost in Space: it can't grasp, and it can't tell when it is crushing what it is holding. O's hand is humanoid. It can be used to sort complex shapes into different boxes after a single training example, and it can pick up a ripe mango without crushing it.
You fellas would fit right in with those cyberdynes
THANKS JOHN, SCOTT…🤗 for explaining in layperson's terms 🤔🤯AMAZING THINGS 😎💚💚💚
Robot, give the knife to Timmy. Robot, stick the knife in Timmy. I'm not so optimistic about this brave new world.
Hi Dr. Know-it-all,
I'm in Australia.
Can't get an X or S.
If Tesla goes to steer-by-wire, and maybe brake- and accelerate-by-wire as well, could that mean cars could be converted between left- and right-hand drive very easily?
Great topic, listening.
Great to hear that your mum is doing so well.
After I heard about her ordeal, that was the first thing that came to mind: did she have an NDE?
@DrKnowItAll Would the next level be tokenizing actions on the fly as situations may change?
Google was smart to use Facebook to train DM how to move. Tesla can do the same, and can take full advantage of the humanoid form and mobility demonstrated in those videos.
My thoughts go to implementing uses with extant surgical robots.
50:00 it is prison 😢❤
I'm from Jersey. During college (early '80s) I worked part-time for a company called 'ORS' (Object Recognition Systems) in Princeton. I tried to find them and don't know where they went. Maybe John knows who I am talking about. I did wire-wrap/PCB design and helped with programming robotic arms. They were going to hire me, but I went to IBM instead.
No Launch here 😮😅
21:00 Show me what DM knows about safety.
FSD probably can't be solved using a single AI model. We should be envisioning FSD as at least the conjunction of a 4D real-world AI, a navigation/route-planning AI, ... AND a supervisor AI to control the system.
"A ring to control them all."
1X is working on Neo, which uses this same solution
28:10 Those tasks take a lot of precision and presumed knowledge of the objects. Watch a toddler try to perform them. Try to pick up and shoot a basketball filled with lead.
This tells me that Tesla's "moat" of having huge amounts of real world data is not that significant. Can anyone comment?
The robot is mostly static and is responding to comparatively simple commands, although with good grip control and a decent visual system (though it relies on lidar for ranging, which is a bit of a cheat).
Tesla's Optimus is intended to be fully mobile and able to navigate its environment just using passive visual receptors, while responding to natural language requests. What this robot is doing is an impressive party trick but a whole order of magnitude easier to achieve than what Tesla is trying to accomplish.
24:47 And vastly simpler than what current FSD Beta Tesla cars are doing now. The moat is still very wide.
Consider Tesla's data collection to be the ChatGPT dataset. You can build any sentence using the ChatGPT dataset. With a large enough FSD dataset you can simulate driving through any scenario. You need that dataset to build the algorithm that will let you drive through almost every scenario using a relatively inexpensive computer that will fit under your seat and won't drain the battery before you arrive at your destination.
This Google robot would have been Earth-shaking shortly after SRI's Shakey in 1968. Not too impressive now.
41:00 The new lead. Nanoplastics, forever chemicals, global warming. Engineering safety failures.
DM is not following commands. DM is self-interpreting language input and creating its own solutions. The solutions shown have numerous errors, and as a biomedical engineer I see no evidence of any safety engineering. It is much more difficult to do tasks safely than it is to step through a newly derived task. I would like to see the failures. What if the cup were a hand, or the chips were a container of poisonous material? There is a good reason why O's gait changed from steppage to a shuffle. I'd like to see O's trash can of crushed fruit.
One of the earliest AIs, SHRDLU, created by Terry Winograd at MIT in 1968, instructed a (simulated) robot arm in natural language. We've obviously come a long way since then, but this AI doesn't seem remarkably innovative to me. Natural language processing has progressed remarkably. The robotics illustrated here doesn't encourage me to expect humanoid robots any time soon.
The failure of this approach is that it requires a supercomputer on the web to assist its actions. The same will be true of Tesla's initial efforts. This will never scale, due to expense.
Supercomputers are needed for the modeling and learning. Much smaller computers can be used for real-world applications of the derived algorithms. A somewhat imperfect example: you need a large factory to build a 4680 battery; you don't need the factory to use the battery.
@darwinboor1300 So why does ChatGPT only work in the cloud on supercomputers and not on your laptop?
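To make the train-big/run-small point in this thread concrete (and to gesture at an answer to that question: a ChatGPT-scale model is simply far too large for a laptop, whereas a narrow task-specific network can be shrunk to fit an embedded computer), here is a toy PyTorch sketch. The network shape and sizes are invented; this is not any real production pipeline.

```python
# Toy illustration of the train-big / deploy-small asymmetry.
import os

import torch
import torch.nn as nn

# Stand-in for a policy network trained on a big cluster.
model = nn.Sequential(
    nn.Linear(256, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 16),  # e.g. 16 control outputs
)

# Dynamic int8 quantization: weights stored in 8 bits instead of 32,
# roughly 4x smaller, so inference fits on a modest embedded computer.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)


def size_mb(m: nn.Module) -> float:
    """Serialize the weights and report the on-disk size in megabytes."""
    torch.save(m.state_dict(), "/tmp/m.pt")
    return os.path.getsize("/tmp/m.pt") / 1e6


print(f"fp32: {size_mb(model):.2f} MB, int8: {size_mb(quantized):.2f} MB")
```

The training cluster is only needed to produce the weights; once distilled or quantized, a narrow model like this runs fine on small hardware, which is the asymmetry the battery-factory analogy above is pointing at.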
When my algebra teacher asked to show steps, I always included a few nonsense steps, like multiplying the decimal points.
Would it be appropriate to interview Sandy Munro concerning, say, the manufacturing ability of the robots, or is that out of his wheelhouse?
Nevermind, vid answered my question. Sorry.
Google's billions of dollars bring us the true bumbling robot. Amazing that they could make public such a ridiculous display.
It appears to me that Tesla is way behind Google in autonomous robots
All this conjecture by YouTube engineers grates on my nerves. This is absolutely the ONLY thing AI will replace in tech industries: rampant human BS.
If Google Home is any indication of their AI, then it's a disaster, or they just don't care about their customers. It has deteriorated; they stopped working on what they sold customers. It is basically a $100 kitchen timer. Lol. They lost me as a customer, which is sad so early in their journey. Sorry for the rant, but their integrity is shot.
FRP is overrated… sounds cool though
This resembles "SHRDLU, a natural language processing system created by Terry Winograd in 1970."
To clarify, an example SHRDLU dialog might consist of the following:
Person: PICK UP A BIG RED BLOCK.
Computer: OK.
Person: GRASP THE PYRAMID.
Computer: I DON'T UNDERSTAND WHICH PYRAMID YOU MEAN.
Person (changing their mind): FIND A BLOCK WHICH IS TALLER THAN THE ONE YOU ARE HOLDING AND PUT IT INTO THE BOX.
Computer: BY "IT", I ASSUME YOU MEAN THE BLOCK WHICH IS TALLER THAN THE ONE I AM HOLDING.
Computer: OK.
Person: WHAT DOES THE BOX CONTAIN?
Computer: THE BLUE PYRAMID AND THE BLUE BLOCK.
Person: WHAT IS THE PYRAMID SUPPORTED BY?
Computer: THE BOX.
Person: HOW MANY BLOCKS ARE NOT IN THE BOX?
Computer: FOUR OF THEM.
Person: IS AT LEAST ONE OF THEM NARROWER THAN THE ONE WHICH I TOLD YOU TO PICK UP?
Computer: YES, THE RED CUBE. (Wikipedia)
The SHRDLU research was exciting because it allowed the user, albeit in highly constrained circumstances, to communicate directly with the computer in English, rather than having to learn a machine programming language.
projects.csail.mit.edu/films/aifilms/AIFilms.html
Swing low, sweet robo-taxi, coming for to carry me home ???? Hope your mother is doing well.
L😆L ❤
Cool robot, but it does not seem to be very useful.
@47:53 Not *Blade Runner.* Go back a few more years to *Damnation Alley* by Roger Zelazny. (†)
Come to think of it, the vehicle looks like a *CyberTruck* _on steroids._
†) Film adaptation: [ en.wikipedia.org/wiki/Damnation_Alley_(film) ]