Looking forward to part 2!
7:44 looks straight out of an '80s retro game where people ride robots instead of cars
wow ... nice results!
Nice work!
Love it. Thank You!
Hi Vassilios!
Waiting for part two
I am in Love ! 😍❤️
This is really impressive, the explanation goes quite deep.
Just had a quick look at your paper, great work and thanks for sharing.
Quick question: For the GP controller, is it right that you sample from the policy's distribution until a feasible action is found? What if the probability of a feasible sample is very low in a certain situation?
Thanks, we are glad you enjoyed it. During deployment we do not need to re-sample until a feasible action is found; we only need to compute the mean of the policy's distribution to generate phase plans. That is the point of formulating the problem as an MDP and training the policy with RL: instead of using the distribution as in sampling-based planning methods, we train the parameterized policy distribution with RL so that it learns to always output valid phase transitions.
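To make the sampling-vs-mean distinction concrete, here is a minimal sketch of a Gaussian policy head (not the paper's actual code; the linear policy, dimensions, and names are made up for illustration). During training the action is sampled for exploration; at deployment the mean is used directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear-Gaussian policy head: maps an observation to the
# mean and standard deviation of a distribution over phase-plan parameters.
obs_dim, act_dim = 8, 4
W_mean = rng.standard_normal((act_dim, obs_dim)) * 0.1  # "learned" weights
log_std = np.full(act_dim, -1.0)                        # "learned" log-std

def policy(obs):
    mean = W_mean @ obs
    std = np.exp(log_std)
    return mean, std

obs = rng.standard_normal(obs_dim)
mean, std = policy(obs)

# Training (exploration): sample an action from the policy distribution.
action_train = rng.normal(mean, std)

# Deployment: skip sampling entirely and act with the distribution's mean,
# which RL training has shaped toward valid phase transitions.
action_deploy = mean
```

The key point is that no rejection loop is needed at deployment: the deterministic mean is used as the action.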
@leggedrobotics Thanks, understood.
@leggedrobotics Hi, is it possible to fast-forward the learning process so that the robot can spend 1 million years learning in only a few weeks?
Why is this better than model-based MPC with pure mathematical optimization? Is it just better because it can learn to handle noisy contacts?
what software do you use for simulations?
good work
I have a question! How do you get the terrain information? IMU? Camera (vision)? LiDAR? I wonder how it is done.
Thank you in advance.
Can't wait to see these ANYmals in action.
This is very good. Is this code open source? Thank you very much!
Is there a way I could get access to the rviz configuration for the '80s theme? Looks very cool!
This visualization was made in raisimOgre so unfortunately there is no easy-to-use configuration to share. Stay tuned for when we release the code though.
In which software are these 3D simulations done?
This work uses the RaiSim physics engine that was developed in-house. Link: raisim.com/
:)
Is there any code to share?
Unfortunately not yet. We do plan to open-source the code later this year though.
@leggedrobotics That would be a great contribution!
@leggedrobotics What constrains the speed at which it walks? Does it have to go at that speed, or is that as fast as possible?
Great work. I hope "part 2: back with vengeance" is a reference to Last Ninja 2 (ua-cam.com/video/Gfkk9BnFB7w/v-deo.html)