Introduction to Trajectory Optimization

  • Published 8 Jul 2024
  • This video is an introduction to trajectory optimization, with a special focus on direct collocation methods. The slides are from a presentation that I gave at Cornell, linked here:
    www.matthewpeterkelly.com/tuto...
    The journal paper version of this talk, to be published by SIAM Review in December 2017:
    www.matthewpeterkelly.com/rese...
    The derivation of the cart-pole dynamics can be found here:
    www.matthewpeterkelly.com/tuto...
    My general tutorial page for trajectory optimization is:
    www.matthewpeterkelly.com/tuto...
  • Science & Technology

COMMENTS • 262

  • @sambrothie 3 years ago +2

    Thank you so much for all of the incredible resources you have created, Matthew!

  • @herrefaber6600 2 years ago +3

    It is very rare that someone finds the right level and tone in a video like this. Great job!

  • @abcddd580 3 years ago +1

    A very lucid, concise, yet sufficiently detailed explanation. Great!

  • @arthuriasbeck5015 4 years ago +6

    You saved my life Matthew. You're incredibly amazing. Thank you a lot!

  • @shenge5347 2 years ago

    This is one of the clearest videos I have found on trajectory optimization. Thank you!

  • @seancheng7887 5 years ago +1

    Thanks Matthew for your nice introduction which helps me have a broad picture about trajectory optimization. I'm working on a project about optimal control of battery packs.

  • @joaosousapinto3614 6 years ago +18

    Thank you for taking the time to record this. You're an excellent teacher! Hats off.

    • @MatthewKelly2 6 years ago +2

      João Sousa Pinto - Thanks! I’m glad that the video has been helpful.

  • @vsaihitesh2237 10 months ago +1

    This is such an elegant visual explanation of an abstract problem. I finally find it peaceful to have understood this topic, thank you very much!

  • @pratikprajapati3977 4 years ago

    This video is a masterpiece for beginners of NLP... Infinite thanks, Matthew!

  • @jonathancangelosi2439 3 years ago

    I'm currently trying to learn about pseudospectral methods for trajectory optimization and this is a fantastic intro to the topic. Thank you!

  • @crusader0775 1 year ago +3

    I am glad I found this. I am a robotics newbie trying to learn trajectory optimization and optimal control. Great presentation, hats off. I appreciate your efforts, thank you.

  • @royshi4802 4 years ago +1

    One of the best tutorials on youtube! Thanks

  • @emmanuelmorales5332 7 years ago +1

    Thank you a lot Matthew. You did an excellent job and I can't tell you how helpful this video has been.

    • @MatthewKelly2 7 years ago +1

      You're welcome! I'm glad that you found it useful.

  • @oguzhancebe9833 7 years ago +3

    The world is a better place because of the guys like you, Matthew.

  • @izharulhaq2436 1 year ago

    Amazing introduction to the field of trajectory optimization. This lecture should be made compulsory for all students in the field. Thanks.

  • @mortezakhosrotabar 3 years ago

    Thank you so much Matthew for your presentation. It was very useful.

  • @maratkopytjuk3490 2 years ago

    Amazing video, enjoyed watching it! I like the clean formulation and the high level view.

  • @leekarlson5348 8 years ago +33

    Thanks a lot Matthew, I spent a few months trying to learn this, and this explains everything in 40 minutes.

    • @cyrilletatang107 6 years ago +10

      Good statement, Ms. Lee. But it is surely because you spent four months on it that you were able to get it in a few minutes.

    • @prajwal2753 3 years ago +1

      Same here. It was more that I couldn't find a particular tutorial explaining this, but this is a good and concise explanation.

  • @jomurciap 3 years ago

    Thank you Matthew, excellent video.

  • @kvasios 3 years ago

    An exceptional amount of value. Thanks!

  • @jcamargo2005 11 months ago

    Thank you for this excellent presentation!

  • @binxuwang4960 3 years ago

    This presentation is so well made, thanks a lot!

  • @awais_arshad 2 years ago

    Thanks a lot Matthew. This tutorial is very helpful.

  • @BernhardWullt 10 months ago

    Great presentation! Thanks!

  • @furiousspirit 7 months ago

    Just wow. Thank you so much!

  • @CISMD 7 years ago +5

    Thanks a lot Matt. I'm doing research in spacecraft trajectory optimization with machine learning. This clarified things for me. Cheers.

  • @dongdongzheng5990 2 years ago

    Thank you for the presentation👍👍👍👍

  • @yosinhu5952 2 years ago

    Thank you, great work! Hope you have a good day!

  • @mindai3415 6 years ago +2

    This is so helpful! Thank you sooooo much!

  • @arshdeepbedi6379 5 years ago

    Thank you Matthew.

  • @jinhocho94 3 years ago

    Great lecture and kindness. Thanks Matthew!! You helped me a lot.

  • @megoo2611 7 years ago

    Great work. We need more lectures!

    • @MatthewKelly2 7 years ago +5

      Thanks! I plan to make more, but have been busy with a new job and writing a tutorial paper on trajectory optimization (to be published soon!). Here is a rough list of topics for videos that I would like to make, somewhat in order:
      --> Trajectory optimization Q&A, based on comments from this video and emails that I've received.
      --> Introduction to orthogonal collocation
      --> Quadrature methods (trapezoid rule, Simpson quadrature, Romberg quadrature)
      --> Numerical methods for initial value problems (Euler's method, Runge--Kutta, ...)

  • @venomdedpol1179 3 years ago

    Thank you so much!

  • @leanneraw3917 7 years ago

    Great video, nice breakdown

  • @leekarlson5348 8 years ago +4

    This is amazing!!!

  • @yacinebenameur8028 3 years ago +2

    So freaking helpful, thank you man

  • @edward090909 7 years ago +2

    Hi Matthew,
    I have another question. Suppose we have a biped robot that we want to design a controller for. Let's say we formulate our Trajectory Optimization Problem over one step ensuring constraints such as periodicity and realistic ground reaction forces at the stance and swing feet. What are the next steps to take it to a real robot that has to be run for more than one step?
    I thought of a PD controller that tracks the resulting state trajectory from Trajectory Optimization, but that sounds less elegant. How did you go about it?
    Thanks,
    Umer

    • @MatthewKelly2 7 years ago +4

      Hi Umer,
      Great question. The short answer: walking robots are hard and there is no one accepted way to generate and stabilize a gait. Long answer:
      One approach is to generate an optimal trajectory for the full model (like you suggested) and then construct a tracking controller to stabilize it. This turns out to be hard if the robot has small feet (like a person). The issue is that an open-loop time-dependent reference trajectory cannot handle high-level balance concerns (eg. I'm moving too fast). The best solution that I know of is to use Hybrid Zero Dynamics, largely developed by Jesse Grizzle. The idea is to rewrite the optimization in terms of the phase (angle of the stance leg), rather than time.
      The more common approach for walking control is to use a simple model (inverted pendulum, linear inverted pendulum, spring-loaded inverted pendulum) and then figure out how to control that (capture point, zero moment point). This makes it easy to do the optimization at run-time (eg. model-predictive control) to handle balance concerns. Once the bulk-motion of the robot is specified, a complicated low-level controller deals with the specifics of the joints. This generally works out to be a quadratic program, which gives the target position, velocity, and torque for each joint motor. Then a trajectory-tracking controller on each joint will compute the current at the motors. Look up Jerry Pratt or Russ Tedrake for some of the most current research along these lines.
      If you're curious about more specific details on methods or references, then send me an email. Contact info is on my website: www.matthewpeterkelly.com/
      Matt

  • @benjy1969 3 years ago

    very good, thanks

  • @user-nv2yo3fn5i 3 years ago

    Thank you so much! It really helps me a lot. Recently, I am trying to learn the pseudo-spectral method for optimal control problem. Can you recommend some good materials? Many thanks!

  • @opq9901827 3 years ago

    Really nice video! I learned a lot from it.

  • @JAA1PA 7 years ago

    Thanks a lot for this!

    • @MatthewKelly2 7 years ago

      You're welcome - I'm glad that it was useful!

  • @desenzhang333 3 years ago

    Thank you Matthew.

  • @hiaaa9394 1 year ago

    Thank you Kelly. I still have a question: if the decision variables in the trajectory optimization problem contain variables other than the control variables, state variables, and time, can OptimTraj solve the problem? Or is there another way to solve it, such as the dual multipliers introduced to better solve some nonconvex optimizations?

    • @MatthewKelly2 1 year ago

      In general, you can add just about anything you want as a decision variable in a trajectory optimization problem. One common example would be slack variables for a constraint, or a parameter in the dynamics model. Specifically in my OptimTraj Matlab package there is some limitation on the types of decision variables, which I did mostly to keep the user interface simple enough to use in an educational context. The GPOPS-II optimization package (also in Matlab) has a bit more flexibility, for example, allowing model parameters to be decision variables. I'm sure other packages have more flexibility as well. Or you could code up your own transcription, and then the sky is the limit. The key thing to keep in mind is that you will need to be careful that you still have a "well behaved" problem. For example, you'll still need consistent / continuous gradients of those new decision variables.

  • @heechanshin9859 6 years ago

    Thanks a lot and a lot and a lot! This tutorial is exactly what I need and want to study! Your teaching is very clear and easy to understand! Thanks again! ( and I'm surprised you are the author of TrajOpt! lol)

    • @MatthewKelly2 6 years ago

      ShinHeechan Thanks - I'm glad that you found the video helpful! Hopefully I'll be able to make a few more on related topics at some point.

  • @MichaelJacobMathew 8 years ago +1

    Thanks Matthew for this nice presentation. Slide no. 5: could you please suggest some resources for handling the if-statements inside an optimization? (Discontinuous dynamics)

    • @MatthewKelly2 8 years ago +6

      There are a few techniques: 1) multi-phase techniques, 2) through-contact techniques, 3) constraints + slack variables, 4) smoothing. The best choice depends on your specific problem. Details below:
      1) If you have a small number of well-understood discontinuities, such as foot collisions with the ground during walking, that occur in a known sequence, then the best option is to pose a multi-phase trajectory optimization problem. This is also commonly used for rocket trajectories, where there are multiple engine phases. The best resources here are by Anil Rao (www.anilvrao.com/), who is the author of GPOPS-II (www.gpops2.com/), a software package specialized for this type of problem. He has several papers on the topic. I also might be able to find some examples if you're interested.
      2) Through-contact (contact invariant) optimization is best for systems with more complicated contact and/or discontinuities, where the sequence of discontinuities is not known. I know less about this, but the principle is to include some special constraints and extra decision variables to force all of the discontinuities into constraints, leaving the resulting NLP smooth. The two best papers are by: Michael Posa & Russ Tedrake (A direct method for trajectory optimization of rigid bodies through contact) and Igor Mordatch & Emanuel Todorov (Discovery of complex behaviors through contact-invariant optimization).
      3) If your discontinuity is simple, such as an absolute value sign in the objective function, then there are many clever tricks with slack variables and constraints. These are well-covered by John T. Betts in Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, and in any other good book on linear, quadratic, or non-linear programming. I have an example of this in my trajectory optimization software: github.com/MatthewPeterKelly/OptimTraj --> OptimTraj/demo/minimumWork/MAIN_cstWork.m
      4) If you have a simple non-linearity, then you can replace every instance of the discontinuous function with a smooth approximation. This tends to be simple and fast, but less accurate than the slack variable method. There are two types of smoothing: exponential and polynomial. If you use polynomial smoothing, make sure it is smooth to second order at least. I have some examples for trajectory optimization in OptimTraj: github.com/MatthewPeterKelly/OptimTraj --> OptimTraj/demo/minimumWork/MAIN_smoothWork.m
      and for the base smoothing functions in:
      github.com/MatthewPeterKelly/Exponential_Smoothing_Matlab
      github.com/MatthewPeterKelly/Polynomial_Smoothing_Matlab
      Good luck!
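
A minimal sketch of the smoothing idea from option 4 above, assuming a polynomial-style smooth substitute for the absolute value (illustrative code, not taken from the OptimTraj demos):

```python
import numpy as np

def smooth_abs(x, alpha=0.01):
    """Smooth approximation of |x|.

    Infinitely differentiable for alpha > 0, exact at x = 0, and the
    error is bounded by alpha everywhere, so shrinking alpha trades
    accuracy against numerical conditioning.
    """
    return np.sqrt(x**2 + alpha**2) - alpha

# The approximation can replace every |.| in an objective or constraint,
# keeping the resulting NLP smooth for a gradient-based solver.
x = np.linspace(-1.0, 1.0, 5)
print(smooth_abs(x))
```

Exponential alternatives (for example, log-cosh based forms) behave similarly; as noted above, whichever form is used should be smooth to at least second order.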

  • @kabirabdulmajeed5185 7 years ago +2

    I love this

  • @sonyerric2 3 years ago +1

    Thank you for the great video, I was playing around with the demos found on Matlab and was wondering (regarding a simple pendulum attached to a cart) if the length of the pole can be varied with respect to time? If so how can I make the necessary modifications to allow this changing length

    • @MatthewKelly2 3 years ago +1

      Good question! In the demo that I wrote, it is a hard-coded assumption that the pole length is fixed. That being said, you can definitely make a version of the dynamics function that supports a time-varying pole length. There are three ways that I could imagine doing it. (1) Add a passive element, such as a spring, that applies along the pole and allows the mass to move along the length. This would require writing out a new dynamics equation, but would not require adding new inputs to the function. (2) Explicitly make the length of the pole a function of time and then pass that function in as a parameter. Then take the time that is passed into the dynamics function and use it to evaluate the pole length (and its derivatives). Then you need to update the dynamics to properly account for accelerations related to changes in pole length. (3) Add an actuator that can control the length of the pole (easiest to use a force actuator here), and then update the dynamics accordingly. This would allow the optimization to "select" the correct length for the pole. In all cases you would need to update the dynamics of the system, but that shouldn't be too tricky as it is a relatively simple system. Good luck!

    • @sonyerric2 3 years ago

      @@MatthewKelly2 I will try to tinker with these ideas. Thank you so much for your suggestions!

  • @kgandhi24 5 years ago

    Hi Matthew,
    Thank you very much for your video as well as the tutorial paper.
    I am confused between Trajectory Optimization and Sample based planning.
    Can't we use Sample based planning output as the initial path for Trajectory Optimization ?
    Kindly help me on this
    Regards
    Kandarp

    • @MatthewKelly2 5 years ago +1

      Hi Kandarp,
      Short answer: yes.
      Long answer: Sample-based planners are generally used to compute complicated configuration-space solutions to path planning problems in task-space, for example moving a 7DoF arm through a cluttered environment. These planners have a hard time handling dynamics. Trajectory optimization (in the sense of this video) is complementary: it is great at handling complicated dynamics but struggles with complex tasks. There are many ways to combine the two, although I'm a bit fuzzy on the details. It would be great to use the output of a sample-based planner to initialize a trajectory optimization, but the trouble is how to represent the task-space constraints efficiently in the trajectory optimization. This is of course possible, but challenging. I think that the TrajOpt library (rll.berkeley.edu/trajopt/doc/sphinx_build/html/) does something close to this.
      Matt

  • @puzzlepencilholder 5 years ago +1

    Super clear explanation! If I understand correctly, shooting method only has control as the decision variables while collocation methods have state and control as decision variables. What is the tradeoff between more decision variables but sparser optimization? And has one of these methods won out over time as 'better' or is it still problem dependent?

    • @MatthewKelly2 5 years ago +1

      That's basically correct: the decision variables in single shooting are the controls, while collocation methods have both state and control as decision variables. Multiple shooting is a gray area in between. In general, collocation methods will be "better" than single shooting methods, and would be my first choice for the majority of problems. One of the key reasons is because the jacobian of the constraints are more sparse, and the linearization changes less between iterations. They also do better with complicated constraints. I can think of a few situations where shooting methods might be superior, although it will depend on problem details: 1) if you have a really good initial guess, then the linearization issues associated with single shooting are less of a problem and the smaller problem size might be helpful; 2) if you need a high-accuracy dynamics solution, but only an approximate control solution, then you can use multiple shooting with one control decision per shooting segment, but many dynamics steps. This is suggested by Betts (in his review paper) for control of spacecraft, which have periodic impulsive control, followed by passive coasting; 3) collocation methods rely on sparse optimization solvers -- if you only have access to a dense solver, then it is possible that a shooting method will be faster than collocation, provided that you have a good enough initial guess to ensure convergence.
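
To make the structural difference concrete, here is a small illustrative sketch (hypothetical code, not from any particular package) of the trapezoidal collocation defect constraints: each defect couples only two neighboring knot points, which is what gives the constraint Jacobian its sparse banded structure.

```python
import numpy as np

def trapezoid_defects(f, t, x, u):
    """Trapezoidal collocation defects: one vector equation per segment.

    f(t, x, u) returns the system dynamics x-dot; t is the time grid,
    x is an (N, n) array of states and u an (N, m) array of controls,
    all treated as decision variables in a collocation method."""
    h = np.diff(t)[:, None]  # segment durations
    fk = np.array([f(tk, xk, uk) for tk, xk, uk in zip(t, x, u)])
    # defect_k = x[k+1] - x[k] - (h/2) * (f[k] + f[k+1]); each defect is
    # zero when the trapezoid rule is satisfied on that segment.
    return x[1:] - x[:-1] - 0.5 * h * (fk[1:] + fk[:-1])

# Example: scalar dynamics x-dot = u, with u = 1 everywhere.
t = np.linspace(0.0, 1.0, 5)
x = t.reshape(-1, 1)   # x(t) = t satisfies the dynamics exactly
u = np.ones((5, 1))
d = trapezoid_defects(lambda tk, xk, uk: uk, t, x, u)
print(np.max(np.abs(d)))   # defects vanish for this feasible trajectory
```

In single shooting, by contrast, the state at every knot depends on all earlier controls through the integrator, so the Jacobian is dense and the linearization changes more between iterations.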

    • @puzzlepencilholder 5 years ago

      Matthew Kelly makes sense thanks!

  • @MinhVu-fo6hd 4 years ago

    Hi Matthew, it is such a great summary of trajectory optimization and the direct collocation method!
    However, since using the nonlinear dynamics as collocation constraints yields a general NLP (nonconvex), do we have any result/reference to guarantee local convergence for that? Moreover, for extremely nonlinear dynamics, using those NLP solvers (fmincon/Gekko), I couldn't even find a feasible point. I think finding a solution to a feasibility problem (in optimization) is particularly hard for a general nonconvex problem. Do you have any experience with or recommendation for these issues? Thanks a lot!

    • @MatthewKelly2 4 years ago +1

      Thanks! In general, there is no good way to guarantee the convergence of a NLP. Even with a well-posed problem, you can often end up with the solver failing if you give it a bad initialization. In some cases, you might give it what seems like a good problem, but the problem turns out to have a singular arc or has some other difficulty that is not obvious at first. I can't think of any good references for proofs, but I do have some general suggestions for getting the optimization to work. My general approach is to start by adding a good regularization (typically path integral of input squared), and removing difficult constraints. Once that works, then slowly add back in the constraints and relax the regularization. Good luck!
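
The regularization term mentioned above (path integral of the input squared) can be approximated on the knot-point grid with the trapezoid rule; a generic sketch (illustrative, not OptimTraj code):

```python
import numpy as np

def control_effort(t, u):
    """Trapezoid-rule approximation of the integral of u(t)^2 along the
    trajectory -- a common regularization term in the objective."""
    f = u**2
    return float(np.sum(0.5 * np.diff(t) * (f[:-1] + f[1:])))

t = np.linspace(0.0, 1.0, 101)
u = np.ones_like(t)          # constant unit input
print(control_effort(t, u))  # analytically, the integral of 1 over [0, 1] is 1
```

In practice this quadratic term helps keep the NLP well behaved early on; its weight can then be relaxed as the harder constraints are reintroduced.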

  • @muhammadmurtaza8558 7 years ago

    Hi Matthew,
    Thanks for the tutorial. I had a query: can we optimize the final position of the state only? I mean, we do not want it to end at a specific position, but rather land at an optimal position. Is it possible to do so?

    • @MatthewKelly2 7 years ago +1

      Hi Muhammad,
      You can have a trajectory end at an optimal position. The way to do this is to specify a boundary objective function J() as shown on slide 6. The integral objective function w() is optional, although it can help make the problem converge faster in some situations. If you don't care about the final position, but just want the path to be optimal, then you can just remove the bounds on the final state. I hope that this helps and let me know if you have any follow-up questions.

    • @muhammadmurtaza8558 7 years ago

      Thanks a lot Matthew.

  • @yalunwen486 3 years ago

    Thank you Kelly for this great presentation. Just wondering, what did you use to insert the equations into the presentation? Thanks again.

    • @MatthewKelly2 2 years ago

      If you follow the first link (www.matthewpeterkelly.com/tutorials/trajectoryOptimization/cartPoleCollocation.svg), you'll discover a big SVG file that has all of the slides in it. I created this file using Inkscape. There is a LaTeX plugin for Inkscape, although it was pretty unreliable at the time of writing this presentation. The animations (really just links to different views onto each slide) were auto-generated using Sozi, an animation front-end for Inkscape.

  • @hassoufayoub8912 5 years ago

    Hey Matthew! Thanks for your help. Can I get the longer list of references? Thanks in advance.

    • @MatthewKelly2 5 years ago

      Hi Hassouf, The best list of references for trajectory optimization that I have is the one at the end of my tutorial paper. You can find the paper and the references here: epubs.siam.org/doi/10.1137/16M1062569 (just click the pdf link on the page to download the paper as a pdf).

  • @kouegoukamenboris3685 7 years ago +1

    Hi! I'm working on large deviations in probability. I really enjoyed your video, thanks a lot. But I have a question: when we also have to minimize the final time (with no user bound), I guess the discretization should not be that different? How do we handle that case? Thanks!

    • @MatthewKelly2 7 years ago

      Hi Kouegou - Thanks for the question! In a fixed-duration trajectory optimization you should have the final time as a parameter. In a variable-duration trajectory optimization the final time becomes a decision variable. In both cases you should compute the intermediate time grid as a function of the final (and initial) time.
      The major difference is in the type of underlying parameter optimization problem: if the final time is a decision variable, then the resulting optimization must be a non-linear program. If you're curious about implementation, take a look at the source code that goes along with the presentation: github.com/MatthewPeterKelly/OptimTraj
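
A small sketch of that idea (a hypothetical helper, not OptimTraj's actual interface): the interior grid points are always recomputed from the boundary times, so the same function serves whether tF is a fixed parameter or a decision variable updated by the solver at each iteration.

```python
import numpy as np

def time_grid(t0, tF, n_knots):
    """Uniform time grid between the (possibly optimized) boundary times."""
    return t0 + (tF - t0) * np.linspace(0.0, 1.0, n_knots)

# Fixed-duration problem: tF is a constant parameter.
grid_fixed = time_grid(0.0, 2.0, 5)   # knots at 0.0, 0.5, 1.0, 1.5, 2.0

# Variable-duration problem: the optimizer proposes a new tF each
# iteration, and the grid simply follows it.
grid_free = time_grid(0.0, 1.6, 5)    # knots at 0.0, 0.4, 0.8, 1.2, 1.6
print(grid_fixed, grid_free)
```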

    • @kouegoukamenboris3685 7 years ago

      Thanks Matthew. I will take a slow look.

  • @shrutimisra751 7 years ago +2

    Hi Matthew,
    Thank you for the video, it was very informative. I've recently started working on gait optimization for bipedal and quadrupedal robots and am looking to references/books that can help me build up a solid background on optimization theory in this context. I was wondering if you have any favourites/suggestions ?

    • @MatthewKelly2 7 years ago +2

      Hi Shruti,
      There are many resources for learning trajectory optimization and bipedal locomotion, here are just a few:
      Russ Tedrake's underactuated robotics course is a great overview of trajectory optimization and
      its applications in robotics.
      www.edx.org/course/underactuated-robotics-mitx-6-832x-0
      John T. Betts - Practical Methods for Optimal Control and Estimation Using Nonlinear Programming.
      This book is excellent. I suggest reading the first four chapters several times.
      epubs.siam.org/doi/book/10.1137/1.9780898718577
      My tutorial paper on trajectory optimization - this collects much of what I learned in grad school.
      There is an example for legged locomotion in the paper.
      www.matthewpeterkelly.com/research/MatthewKelly_IntroTrajectoryOptimization_SIAM_Review_2017.pdf
      There are two good review papers on trajectory optimization, one by Anil Rao and the other by John T. Betts.
      I also have a general web page for learning about trajectory optimization:
      www.matthewpeterkelly.com/tutorials/trajectoryOptimization/
      As for trajectory optimization in bipedal walking, take a look at the papers by
      Jesse Grizzle, Aaron Ames, Russ Tedrake, Michael Posa, Scott Kuindersma.
      Best of luck! Feel free to email me or ask follow-up questions here.
      Matt

    • @shrutimisra751 7 years ago

      Wonderful! Thank you for all the resources. I appreciate it very much!

    • @MatthewKelly2 7 years ago

      You're welcome - glad to help!

  • @karthikmr416 7 years ago +1

    Hi Matthew,
    I have a doubt pertaining to the initial guess used in slide 17. Does the initial guess satisfy the collocation constraints (dynamics) in slide 15, or is this condition outside the feasible region? Thank you.

    • @MatthewKelly2 7 years ago +1

      Karthik MR - great question. The initial guess is not feasible, but this is intentional. In general we cannot construct a feasible initial guess without solving a non-linear program (the very thing that we are initializing).
      The initial guess is generally used to guide the optimization towards a reasonable local minimum, and the initial guess provided on slide 17 does just that. For some difficult problems it is good to provide a feasible initial guess, and this is generally done by solving a sequence of non-linear programs, starting with a more-simple optimization problem and then moving towards the true problem. This is discussed briefly on slide 16.
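
For reference, a standard way to build such a (deliberately infeasible) initial guess is straight-line interpolation between the boundary states; a minimal sketch with hypothetical boundary values:

```python
import numpy as np

def linear_guess(x0, xF, n_knots):
    """Straight-line state-trajectory guess between the boundary states.

    Usually infeasible with respect to the dynamics, but typically enough
    to steer the NLP toward a reasonable local minimum."""
    x0, xF = np.asarray(x0, float), np.asarray(xF, float)
    s = np.linspace(0.0, 1.0, n_knots)[:, None]  # interpolation parameter
    return (1.0 - s) * x0 + s * xF

# Cart-pole swing-up style boundary states: [cart pos, pole angle, rates].
guess = linear_guess([0.0, 0.0, 0.0, 0.0], [1.0, np.pi, 0.0, 0.0], 5)
print(guess[2])  # middle knot point is the average of the two boundaries
```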

    • @karthikmr416 7 years ago

      Thanks a lot Matthew :)

  • @EastAfroEntertainment 5 years ago

    Can you do a tutorial on drone trajectory optimization too? Like minimum-time optimization.

    • @MatthewKelly2 5 years ago +2

      That would be a great topic for another video, but for now I'll have to leave it to someone else. Thanks for your interest though!
      PS. You can use the techniques in this video for minimum-time optimization, but you will have to be careful about regularization and problem formulation to get something that performs well numerically. The more-exact (kinematic) methods typically require computing the exact switching times for path constraints (eg. switching from maximum acceleration to minimum acceleration), which is challenging. Try writing a program to compute the minimum time to travel between two points with constant speed and acceleration limits. That is a good (and tractable) practice problem to get started on.
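
The suggested practice problem has a closed-form answer via a bang-bang (trapezoidal velocity) profile; a sketch assuming rest-to-rest motion with symmetric limits v_max and a_max:

```python
import math

def min_time(d, v_max, a_max):
    """Minimum time to travel distance d, starting and ending at rest,
    with |v| <= v_max and |a| <= a_max (bang-bang acceleration)."""
    d_ramp = v_max**2 / a_max  # distance to accelerate up and brake down
    if d >= d_ramp:
        # Trapezoidal profile: accelerate, cruise at v_max, decelerate.
        return d / v_max + v_max / a_max
    # Triangular profile: never reaches v_max; accelerate halfway, brake.
    return 2.0 * math.sqrt(d / a_max)

print(min_time(10.0, 2.0, 1.0))  # -> 7.0 (cruise phase exists)
print(min_time(1.0, 2.0, 1.0))   # -> 2.0 (triangular profile)
```

Comparing a numerical minimum-time solution against this analytic answer is a good sanity check before moving to harder problems.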

  • @sonyerric2 3 years ago +1

    Hi, is it possible to incorporate obstacle avoidance into the direct collocation methods presented here? Are there some resources that talk about this, or certain keywords I could search for?

    • @MatthewKelly2 2 years ago +1

      Yes, although the details start to get complicated. The general idea is to add a distance-squared constraint at each collocation point. You need to be somewhat careful about avoiding "tunneling". Depending on the complexity of your kinematics, these additional constraints can be expensive to evaluate. I can't think of any good references off the top of my head, but I'm sure that they're out there.

  • @emanuel4516 2 years ago +1

    Hi Matthew, how would you implement a constraint on an intermediate point of the trajectory? Let's say you want to avoid the tree in the Cannon Shooting example on your website. Is it possible to do this with a single shooting method, or do you have to rely on a multiple shooting or direct collocation method?

    • @MatthewKelly2 2 years ago +1

      You can add a constraint on an intermediate point in either method, although the gradients are much nicer in the direct collocation formulation. The approach is to select one or more knot points at which to apply the constraint, and then append them to the optimization problem. You always need to watch out for "tunneling". For example, if you have widely spaced collocation points, you can compute a "feasible" trajectory that passes directly through a thin wall if each of the collocation points is "not in collision" with the wall. There are various heuristics for avoiding this, most either based on "padding" the constraint (making the wall wider than it really is) or adding extra collocation points near obstacles.
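
A sketch of the padded distance constraint described above (the function name and values are illustrative, not from any package):

```python
import numpy as np

def obstacle_clearance(x, y, cx, cy, radius, padding=0.1):
    """Inequality constraint value for a knot point: must be >= 0.

    Padding inflates the obstacle, one of the heuristics mentioned above
    for reducing tunneling between widely spaced collocation points."""
    r_eff = radius + padding
    return (x - cx)**2 + (y - cy)**2 - r_eff**2

# Applied at every knot point of a candidate trajectory:
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([0.0, 0.2, 0.2, 0.0])
c = obstacle_clearance(xs, ys, cx=1.5, cy=0.0, radius=0.3)
print(c >= 0)  # any False entry means that knot point is in collision
```

In a transcription, one such inequality is appended per knot point (and per obstacle), and the NLP solver decides which constraints are active.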

    • @emanuel4516
      @emanuel4516 2 роки тому

      @@MatthewKelly2 thanks for the answer! I am thinking on how to implement this intermediate constraint though. Maybe using a conditional statement (in the tree example one could do something like this:
      IF x=obstacle_coordinate AND
      y

    • @MatthewKelly2
      @MatthewKelly2  2 роки тому

      @@emanuel4516 - Correct. The 'if' statement would be no good, as it causes a discontinuity in the sparsity pattern of the constraint jacobian. You'll want to add a constraint something like: 'distance_to_obstacle(state(i)) > 0', and then apply that constraint for every single point on the trajectory that could come in contact with the tree. This allows the NLP solver itself to figure out whether the constraint is active or not. The 'distance_to_obstacle' function should be continuous (no if statements).
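      For illustration, a minimal Python sketch of such a constraint function (hypothetical circular obstacle; the names and geometry are made up):

```python
import math

def distance_to_obstacle(state, center=(3.0, 0.0), radius=1.0):
    """Signed clearance from a circular obstacle: positive outside,
    negative inside. Continuous everywhere (no if statements), so
    the NLP sees a well-behaved constraint jacobian."""
    x, y = state[0], state[1]
    return math.hypot(x - center[0], y - center[1]) - radius

# The transcription would then enforce
#   distance_to_obstacle(state_i) > 0
# at every collocation point that could contact the obstacle.
```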

  • @ethanepp849
    @ethanepp849 2 роки тому

    You mentioned you had some references for working on multi stage problems and I was wondering if you would be able to link them? If it helps, the specific problem I am working on (using the OptimTraj library you made) is a sort of toy problem: a point mass navigating a 2D obstacle course, with obstacles it can't cross as path constraints and a series of points that it has to reach along its path. I am very new to this sort of thing so any comments or references are greatly appreciated. Thanks in advance, and great video!

    • @MatthewKelly2
      @MatthewKelly2  2 роки тому +1

      That sounds like an interesting problem to work on and learn from. I recommend starting simple and then adding constraints until you get to solving the full problem. For multi-phase references, look up the GPOPS-II user manual and associated papers by Anil Rao et al. There are definitely others out there, I just don't know them off the top of my head. The big picture concept with phase constraints is that you are connecting multiple "mini trajectory optimization problems" into one larger problem using "linkage constraints". The linkage constraints connect the boundary state on one phase to the boundary state of another phase. Interestingly, there is nothing that requires them to be sequential -- in general they can form a graph. The main challenge comes in setting up a clean user interface for specifying them, which is why they are not included in OptimTraj -- I wanted to keep it as simple as possible to use. That being said, they are not too challenging to hack into that software for a specific case.
      One side note -- my OptimTraj package is "optimized" for being easy to read and use as a learning tool, while still being fast enough to use for toy problems. The problem you are setting up is complicated enough that you are probably going to start running into some of the limitations of both OptimTraj and Matlab itself. If you're tied to Matlab, then I suggest switching to GPOPS-II, which is heavily optimized for performance on larger multi-phase problems. If you're not tied to Matlab, then I would suggest hunting around for a C++ implementation (perhaps with python bindings, if they exist) that would be faster.

  • @logimatic4070
    @logimatic4070 5 років тому

    Hello Matthew, thank you for the great presentation!
    I've been struggling with a problem for some time and would appreciate if you (or someone ;)) could point me in the right direction for how to tackle it:
    How do we find the time-minimal trajectory of a point on a plane with a maximum acceleration constraint, such that it starts with zero velocity at a given position and passes through a given set of fixed points on that plane in a given order?
    I imagine we should keep applying the maximum acceleration to the point throughout the movement, and only work on its orientation (acceleration, braking, cornering…) but would it be a continuous variation or a finite number of fixed orientations? can it be solved analytically or should it be numerically?
    Thanks a lot for any help or references!

    • @MatthewKelly2
      @MatthewKelly2  5 років тому

      Thanks! Great question: How to do minimum-time trajectory optimization with acceleration constraints? It is surprisingly hard to do well. I think that the state of the art in this is the Reflexxes motion planning library. They link to several reference papers on their home page. There are a few special cases that can be solved analytically, for example, moving in a straight line between two points with arbitrary boundary conditions. Once you get into vector trajectories (of which a 2D plane is the simplest case), then it gets a lot more complicated. The solution to a minimum-time optimization will always lie on a constraint, so the minimal set of information to describe the solution is the set of switching times and which constraints are active between each pair of switching times. This turns out to be a nasty optimization: the number of switching times is unknown (integer programming) and the state is a nonlinear function of time. There are roughly two approaches: 1) smooth out the problem (eg. add a cost on jerk) and then use an optimization like I describe in this video. You could also use something like GPOPS-II, which automatically will refine the mesh near the switching points. 2) work out the logic to determine which constraints are active ahead of time, and then do root finding to compute the mode-switches. Good luck! Start with reading through the Reflexxes papers, and then after that you might find some good papers by Anil Rao (the author of GPOPS-II) on related topics.

    • @logimatic4070
      @logimatic4070 5 років тому

      @@MatthewKelly2 Thanks a lot for your comprehensive answer!

    • @MatthewKelly2
      @MatthewKelly2  5 років тому

      @@logimatic4070 You're welcome!

  • @edward090909
    @edward090909 7 років тому

    Hi Matthew,
    Thanks for this amazing tutorial you have put together. I have used this a lot to formulate my problem.
    I have a specific question though about using GPOPS 2. I give an initial guess for initial and final states, and as it happens with the other methods (trapezoid, hermite-simpson etc), an interpolation between the initial and final state is done to find the states at intermediate grid points. Now, my problem is that the interpolation method used by GPOPS 2 brings in a state that gives NaN for one of the path constraint functions. I had this problem when using hermite-simpson too, but I changed the interpolation method option (in 'interp1') from 'Linear' to 'Nearest', and it bypassed that problematic state. I wonder if there is a similar way to change the interpolation method in GPOPS 2.
    Thanks,

    • @MatthewKelly2
      @MatthewKelly2  7 років тому +1

      Hi Umer, I'm glad that you liked the tutorial! I'll do my best to answer your question:
      1) By default, GPOPS-II will use the `sparseNaN` method to compute the sparsity pattern for numerical derivatives. As a result, it will intentionally pass `NaN` to your constraint and objective functions at the first iteration. This works great as long as you have structured your code to never be able to generate a `NaN` value. If there is a valid input that results in `NaN`, then it will cause problems. You can check this by temporarily changing the setup.derivatives.dependencies to either `full` or `sparse`.
      2) It seems strange that linear interpolation should produce a `NaN` while nearest interpolation does not. This makes it sound like something is wrong with your problem setup. I'm not sure how the internals of GPOPS-II work, but I suspect that you will get a slightly different interpolation grid if you change either `setup.method` or `setup.mesh.method`.
      Summary: I think that your best debugging strategy should be to look carefully through your constraint and objective functions. See if there are any lines that could produce a `NaN` result, and change any that you find. Also make sure that there are no if statements or other discontinuities in these functions.
      Good luck!
      Matt

    • @edward090909
      @edward090909 7 років тому

      Thanks for the reply, Matthew. I missed one small detail in the question and have one follow up question.
      - I face the 'NaN' error only for the initial guess, in the GPOPS 2. That is, for the first time, when interpolation is done, I get a state vector for the middle point that gives me an 'NaN' in the constraint function. It does not go to the second iteration at all after this first NaN.
      Question:
      - Specifically, 'NaN' appears because of a matrix inverse in the reaction force calculation for the stance leg.
      I know matrix inverses are not the best things to use here but I can't think of another way. Do you think there is a better way to calculate stance leg reaction forces?
      Thanks!

    • @MatthewKelly2
      @MatthewKelly2  7 років тому +1

      When you're doing dynamics for rigid-body systems you can usually do a linear solve, rather than a true matrix inverse. This has much better numerical properties and is faster to compute. In Matlab:
      `x = A\b; % Good: do this whenever possible. `
      `x = inv(A)*b; % Bad: avoid this whenever possible.`
      For debugging, use a conditional break-point to stop the execution in your constraint function when it detects a NaN. See if you can figure out what is causing it. Is there a NaN in either A or b that is propagating into x? If not, then there is likely something going wrong in the way you compute A. For example, is the matrix singular or ill-conditioned? Also make sure that the NaN is not getting passed as input to your function - if that is the case then it is probably part of GPOPS sparsity pattern calculation, or there is a problem elsewhere in the code.
      Good luck!
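      One way to mimic that conditional break-point in plain code is a small guard that raises the moment a non-finite value appears, instead of letting it propagate silently (an illustrative Python sketch, not part of GPOPS-II):

```python
import math

def check_finite(name, values):
    """Raise immediately if any entry is NaN or Inf, reporting where.
    Sprinkling this around the dynamics/constraint code localizes
    the first appearance of a bad value."""
    bad = [i for i, v in enumerate(values) if not math.isfinite(v)]
    if bad:
        raise ValueError(f"{name} has non-finite entries at indices {bad}")
    return values
```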

  • @chaitanya.awasthi
    @chaitanya.awasthi 6 років тому +1

    Hello Matthew,
    It looks like I am dealing with stiff dynamic system in my research (not entirely sure though). Now, I understand that I can use stiff solvers in MATLAB such as ode23s to simulate the dynamics, however, the trapezoidal and pseudospectral optimal controllers that I coded up fail miserably when the dynamic constraint is stiff. Are you aware of any literature that tries to address this in the framework of optimal control?

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      Hi Chaitanya, great question! You are correct: trapezoidal collocation on a uniform grid and global pseudospectral methods will both fail for stiff systems. There are two aspects to the transcription that you need to worry about with a stiff system: 1) choice of grid points, 2) choice of method. For the choice of grid points you should use mesh refinement (also known as adaptive meshing or grid refinement). The best references for this are by Anil Rao and Michael Patterson, although Betts also touches on it. It is well implemented in GPOPS-II and the papers that describe how it works. These methods are analogous to adaptive-step integrators (such as ode23 or ode45). You also want to get the choice of method right, and I know less about this. For example, I'm pretty sure that you don't want to use Euler's method. I think that you are likely to be best off with a low or medium order implicit method, although someone else likely knows best. I suggest that you start with trapezoidal collocation and focus on getting the mesh refinement working. You could use OptimTraj as a starting point - it provides mesh analysis as part of the output for both trapezoid and Hermite-Simpson direct collocation. Just put the optimization call inside a for loop and you're ready to go. The off-the-shelf solution would be to use GPOPS-II, which will almost certainly be faster and more accurate than a home-brew solution (with or without using OptimTraj). Good luck!
      Matt

    • @chaitanya.awasthi
      @chaitanya.awasthi 6 років тому

      Thank you Matthew for your insightful reply. I will look into the references you provided and will come back to share whatever I find.

  • @dkaka6687
    @dkaka6687 3 роки тому

    Could you shed some light on slide 6: why does the g() term represent constraints for the trajectory to be periodic? Does requiring the trajectory to finish within a certain period always help the cost function decrease to its minimum? Thanks

    • @MatthewKelly2
      @MatthewKelly2  3 роки тому

      The `g()` term on slide 6 represents *any* nonlinear boundary constraint. One example of such a constraint would be a periodic boundary constraint. A "periodic" boundary constraint means that the state of the system (x) is the same at the beginning and the end of the trajectory. One place where this is commonly used is in optimizing periodic walking or running gaits. There are lots of other types of constraints that can be modeled by `g(x)` as well, it all depends on the application.

  • @BDEvans
    @BDEvans 3 роки тому +2

    Thanks so much. By far the most useful resource that I have found. Great job
    Would you please share the code for this example?

    • @BDEvans
      @BDEvans 3 роки тому

      If anyone is looking for an example, I coded one using the CASADI library in Python

  • @BDEvans
    @BDEvans 3 роки тому +2

    How do you do minimum time trajectory optimisation? Do you make time one of the variables?

    • @MatthewKelly2
      @MatthewKelly2  3 роки тому +2

      Exactly! The easiest way to do minimum-time trajectory optimization is to include the total duration as a decision variable. Then each knot time is computed as an affine function of the total time. Whenever you add time as a decision variable in trajectory optimization you will immediately make the optimization much more nonlinear, due to how time plays into the integration scheme for the dynamics. Then you can make things more complicated by setting up multi-phase problems (each phase duration is a decision variable), and mesh refinement (iteratively selecting better collocation points based on the solution to the previous problem).
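      As a small illustration (not from the video), here is what "each knot time is an affine function of the total time" looks like for a uniform mesh:

```python
def knot_times(total_time, num_knots):
    """Uniform mesh: t_k = (k / (N - 1)) * T. The mesh fractions
    k / (N - 1) are fixed, so only the total duration T needs to be
    a decision variable in a minimum-time problem."""
    n = num_knots
    return [total_time * k / (n - 1) for k in range(n)]
```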

  • @sarveshchakradeo6922
    @sarveshchakradeo6922 8 років тому +3

    Hello Matthew,
    In slide 5, you have mentioned discrete time trajectory optimization methods. Can you point to a few resources related to that?

    • @MatthewKelly2
      @MatthewKelly2  8 років тому

      I've heard that the following textbook by Bertsekas is good: Dynamic Programming and Optimal Control, Volume 1, 3rd edition, Athena Scientific (2005). There might also be some good information in the book: Applied Optimal Control (1975) by Bryson and Ho. Below I've included the BibTeX citation for a paper that looks useful, although I've not carefully read it.
      Another thing to note is that there are multiple aspects of the optimal control problem that can be optimized. The references here are primarily for problems that have discrete time but continuous state and control spaces. If you're interested in problems that have discrete time, state, and control, then you get into a different (but related) subject. For those problems, look up: value iteration, policy iteration, markov decision process, graph search, dijkstras method.
      @article{Mayne1966,
      abstract = {ABSTRACT A second-order method of successively improving a control sequence for a non-linear discrete-time system is derived. One step convergence is obtained for linear systems with quadratic performance functions. Although the results are of interest in their own right, a second-order method for continuous-time systems is obtained by formally allowing the sampling interval to approach zero. The equations so obtained differ slightly, because of the method of derivation, from results already obtained using the calculus of variations approach. The difference, which is an advantage of the method described in this paper, is that one vector differential equation less has to be integrated. The approach used in the derivation is motivated by dynamic programming and facilitates the application of gradient methods to stochastic problems which will be the subject of a future paper.},
      annote = {Differential Dynamic Programming (DDP)},
      author = {Mayne, David},
      doi = {10.1080/00207176608921369},
      file = {:home/matt/Documents/Mendeley Desktop/00207176608921369.pdf:pdf},
      isbn = {0020717660892},
      issn = {0020-7179},
      journal = {International Journal of Control},
      keywords = {Differential Dynamic Programming},
      mendeley-groups = {TrajectoryOptimization,Dissertation},
      mendeley-tags = {Differential Dynamic Programming},
      number = {1},
      pages = {85--95},
      title = {{A Second-order Gradient Method for Determining Optimal Trajectories of Non-linear Discrete-time Systems}},
      volume = {3},
      year = {1966}
      }

    • @sarveshchakradeo6922
      @sarveshchakradeo6922 8 років тому

      Thank you so much!

  • @rakmo97
    @rakmo97 2 роки тому +1

    Can you leave tf unconstrained from T? Is it possible that a more optimal solution takes a different amount of time from 2 seconds?

    • @MatthewKelly2
      @MatthewKelly2  2 роки тому

      Excellent question. Yes, you can make the duration of the trajectory a decision variable, and in many cases that allows the solver to find a "more optimal" solution. One detail is that the total duration then couples nearly every constraint in the optimization problem, making the overall problem more challenging to solve. You can also end up with some counter-intuitive behaviors if you don't set up the objective function and constraints carefully. One related topic: it is generally a bad idea to make the duration of all segments into decision variables, as the solver will often get stuck (Betts discusses this in his book in some detail). Generally you fix the "mesh fraction" allocated to each segment, and then allow a small number of "phase durations" to be decision variables (see GPOPS-II documentation for more on this).

  • @klaradks5195
    @klaradks5195 3 роки тому +1

    Hi Matthew thank you so much for the introduction video! Great explanation!
    I want to ask you a question. How to handle a control variable that is also a function of one of the state variables? So for example, in the train trajectory optimization problem, one of the control variables is the tractive force and the states are the position and speed. The tractive force profile of the motor is actually dependent on the speed so that the upper bound of the force changes with speed. In this case, how do you define the upper bound of the control variable? Thank you

    • @MatthewKelly2
      @MatthewKelly2  3 роки тому +2

      Great question. By definition, a control variable is independent of state: it is the input to the system. That being said, the limits (bounds) of a control variable can depend on state. I've shown them on slide 6 as constant, but that was just a simplification for keeping things readable.
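      A speed-dependent force limit like the one described can be written as an ordinary path constraint g(x, u) <= 0, evaluated at each collocation point. A hedged Python sketch with a hypothetical power-limited motor model (all numbers are made up, and a real NLP would want the min() smoothed):

```python
def traction_limit(u, v, f_max=10.0, power=20.0, v_eps=0.1):
    """Path constraint g = u - u_max(v), to be enforced <= 0.
    The upper bound on tractive force falls off with speed:
        u_max(v) = min(f_max, power / v)
    (hypothetical model; v_eps guards the division at low speed)."""
    u_max = min(f_max, power / max(v, v_eps))
    return u - u_max
```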

    • @klaradks5195
      @klaradks5195 3 роки тому

      @@MatthewKelly2 Thank you

  • @ziadzamzami
    @ziadzamzami 8 років тому +3

    Excellent presentation Matthew and thanks for sharing ;-). I have a couple of questions/notes:
    1) On slide 4 you showed a visual representation of the closed-loop solution (global method). Is that a phase portrait? How did you calculate the vector field? Do you have good references on these methods?
    2) From personal experience, in practice, decreasing the step size (increasing the number of segments) doesn't necessarily increase the accuracy. The relationship is more subtle, because by increasing the number of segments you increase the size of the NLP and thus it becomes more difficult to solve. So I guess in practice there is an optimal step size depending on the problem.

    • @MatthewKelly2
      @MatthewKelly2  8 років тому +4

      +ziad zamzami Thanks!
      --> How did I make this slide? I constructed an arbitrary value function that looked interesting (contour lines). The optimal policy (arrows) is simply the gradient of the value function. I then computed the optimal trajectory by simulating the policy backward in time from the goal. This value function / optimal policy pair do correspond to some real system and objective (reward) function, but I don't know what they are.
      --> Background: The solution to any optimal control problem has two parts: an optimal policy (the global solution) and a value function. Given the optimal policy, you can construct the value function, and given the value function, you can compute the optimal policy. These are computed by solving the Hamilton Jacobi Bellman equations, a set of non-linear partial differential equations. Value iteration and policy iteration are two common solution approaches. Just like trajectory optimization, the global formulation of the optimal control problem must be transcribed into something manageable. The simplest thing to do is to just "grid-up" the state and control space, and then solve the equations. Alternatively, you can use general-purpose function approximators, such as neural nets or K-nearest-neighbors, to approximate the policy and value function. One commonly used model is that of a Markov Decision Process.
      --> References: The single best "getting started" reference are the lecture notes for Andrew Ng's CS-299 couse at Stanford (cs229.stanford.edu/notes/cs229-notes12.pdf). Next, I would suggest Russ Tedrake's Underactuated Robotics course notes, the chapters on Optimal Control and Motion planning (ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-832-underactuated-robotics-spring-2009/readings/MIT6_832s09_read_preface.pdf). These are both more from the "machine learning" side of things, but you can also approach this topic from the more formal mathematics side as well, which I'm less familiar with.
      --> A few professors who write papers on this sort of thing, not in any particular order: Michiel van de panne, Alex Vladimirsky, Andrew Ng, Pieter Abbeel, Ashutosh Saxena, and Chris Atkeson.

    • @randywelt8210
      @randywelt8210 6 років тому

      You are talking about dynamic programming in RL.

  • @thohoduc5987
    @thohoduc5987 6 років тому

    Hi Matthew. Thanks a lot for your nice presentation. But I have one question about the Mesh refinement at slide 23. In the "h" method, I guess the knot points are those obtained by the NLP. But I am quite confused about the "collocation points". I guess that we can find them by searching the entire time span (with much smaller step-size) to identify the points where the collocation constraint is satisfied. Is that correct? Besides, in the "Four 2-order segment" figure (slide 23 also), I saw both knot and collocation points are increased. Does it mean that you decrease the step-size in the NLP to obtain more trajectory segments? It would be really appreciated if you could answer. Thanks!

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      Great question. The system dynamics are evaluated at the collocation points. The knot points are the boundaries of a polynomial segment. For low-order methods, such as trapezoidal collocation, the knot points and collocation points are identical. For higher-order methods the knot points and collocation points are different, usually with more collocation points than knot points. We select the collocation points first and then add a constraint to the optimization that requires that the system dynamics be correct at those points. The choice of collocation points is related to the design of the implicit integration scheme. For example, in Gauss orthogonal collocation they are chosen to be the roots of a Gauss polynomial. I selected the methods on the slide such that the low and high order methods use the same number of function evaluations. A low-order method will use many smaller segments (thus a shorter "step-size"), while a high-order method will use fewer larger segments. The "step-size" of an integration method is the distance between knot points. Let me know if you have any follow-up questions!
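      For the low-order case, the collocation constraints ("defects") are easy to write out. A Python sketch of trapezoidal defects for a scalar state (illustrative only):

```python
def trapezoid_defects(times, states, controls, dynamics):
    """Defect constraints for trapezoidal collocation:
        x[k+1] - x[k] - (h/2) * (f[k] + f[k+1]) = 0
    with f[k] = dynamics(t[k], x[k], u[k]). The NLP solver drives
    every defect to zero."""
    f = [dynamics(t, x, u) for t, x, u in zip(times, states, controls)]
    defects = []
    for k in range(len(times) - 1):
        h = times[k + 1] - times[k]
        defects.append(states[k + 1] - states[k] - 0.5 * h * (f[k] + f[k + 1]))
    return defects
```

      For dynamics whose rate is linear in time (e.g. xdot = u with u = t), the trapezoid rule is exact, so the defects of the true solution are zero.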

    • @thohoduc5987
      @thohoduc5987 6 років тому

      Thanks for a great answer, Matthew. Now I understand the knot points and collocation points. However, in both "h" and "p" methods, basically we are trying to decrease the "step-size", hence the collocation and knot points increase. Actually, at the slide 23 "Mesh Refinement", I saw that the total number of collocation and knot points is the same, therefore the number of decision variables input to the NLP is the same for the two methods. So my question is: which of the "h" and "p" methods is more accurate, and which one is more computationally effective?

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      The question of whether to use many low-order segments or few high-order segments has two aspects. The first is how well the discretization approximates the "true" solution, and the second is how the optimization performs using the choice of discretization. The short answer is as follows:
      If the solution to the optimization is analytic (fancy term for smooth), then you should always use a high-order method. If there are discontinuities in the solution then you should use a low-order method. For a technical discussion, look up papers on "HP-adaptive meshing" by Anil Rao. Another good reference is the 2010 textbook by Betts that I reference at the end of this video.
      If you use a high-order method, it is extremely important that it is implemented correctly, as naive implementations will have poor numerical properties.
      The GPOPS-II library for Matlab will automatically determine which discretization method is best, and the papers that describe it explain how the math works.
      Thanks for a great follow-up question!

    • @thohoduc5987
      @thohoduc5987 6 років тому

      Thanks, Matthew. It is clear to me now. Wonderful!

    • @MatthewKelly2
      @MatthewKelly2  6 років тому +1

      I'm glad to help!

  • @dinesh1286
    @dinesh1286 5 років тому

    Hi Kelly,
    How do you find f(k+1) in x(k+1)?

  • @Nomolosos89
    @Nomolosos89 Рік тому

    @MatthewKelly2, thank you so much for your time, excellent explanation. I also liked the presentation, very clean and well organized. I would very much like to know if there is a template for it.

    • @MatthewKelly2
      @MatthewKelly2  Рік тому

      Thanks! There isn't a template, but you can download the source code (SVG + HTML) for the presentation using the link in the description. It is written in Inkscape, with animations using Sozi.

  • @scose
    @scose 4 роки тому +1

    I'm interested in how an adaptive meshing scheme would choose between adding more knots vs. increasing polynomial order in a span of high error. It seems not obvious at all...

    • @MatthewKelly2
      @MatthewKelly2  4 роки тому +1

      Good question! The math behind choosing more knots vs. polynomial order is tricky. The basic idea is that if the solution is locally smooth, then use a higher order; if it is non-smooth, then use more knot points. The decision is made based on the solution (not the problem itself), which is why these methods require an iterative solution. The expert on the topic is Anil Rao. Here is one of his main papers on the topic: www.anilvrao.com/Publications/JournalPublications/OCAM-13-0135.pdf.

  • @kvasios
    @kvasios 3 роки тому

    One question, could someone use Pontryagin's maximum principle or hamilton jacobi bellman eq as a starting point for problem formulation and take it from there or I am missing something? Thanks!

    • @MatthewKelly2
      @MatthewKelly2  3 роки тому +1

      I believe so. A "Trajectory Optimization" problem is just solving a special case of the Hamilton Jacobi Bellman (HJB) equations for a single point. In fact, you can use the full solution (optimal policy) from the HJB equations to reconstruct an optimal trajectory. The problem is the "curse of dimensionality" -- usually it is impractical to solve the HJB directly, but it is feasible to compute a locally optimal solution at a single point (trajectory optimization). I believe you could derive any trajectory optimization framework from the HJB equations, but I'm less familiar with the mathematics involved, and find the derivations in this video easier to understand. You can definitely set up and solve trajectory optimization starting from the Pontryagin's maximum principle, and that is closely related to how "indirect" methods for trajectory optimization work. I know less about those techniques, but if you follow the references at the end of the presentation you'll be able to learn more.

    • @kvasios
      @kvasios 3 роки тому

      @@MatthewKelly2 Thank you so much for taking the time to answer! I really have a better understanding now... good point about the Pontryagin's max principle and the "indirect" methods indeed.

  • @pedrocalorio1655
    @pedrocalorio1655 3 роки тому

    In the trapezoid method, where you assume that between each pair of data points there is a line connecting them, isn't this a problem in optimization, since the first and second derivatives of this function are NOT continuous?

    • @MatthewKelly2
      @MatthewKelly2  3 роки тому

      Good question, if I'm understanding it correctly. Let me try restating it: How can we use a non-smooth function to represent the trajectory in a nonlinear program that requires everything to be smooth? The key here is that the trajectory (piecewise linear in time) is non-smooth in time. The NLP doesn't care at all about time: it only cares about decision variables. If you carefully look at the gradients, you'll find that the gradient of the trapezoidal discretization with respect to the decision variables is continuous. Another possible interpretation: What if I need the solution trajectory to be smooth with respect to time? Then you can use a higher-order method that has a solution trajectory that is smooth with respect to time.

    • @pedrocalorio1655
      @pedrocalorio1655 3 роки тому

      @@MatthewKelly2 this actually makes a lot of sense, the NLP solver being independent of the time. Thanks once again for the attention! Learning a lot from!!

  • @chaitanya.awasthi
    @chaitanya.awasthi 6 років тому +3

    Hi again, Matthew
    I was able to implement my own pseudospectral optimal control method thanks to your excellent tutorial. I would now like to extend it to multiphase systems and I am not sure how to go about it. I've looked up literature and there are insightful papers by Betts, Rao, etc. that talk about it and I do understand what needs to be done. But, I just don't understand how to implement it. I understand the linkage constraint requirement, but I have trouble linking the "time" constraint from one phase to the next. I'll really appreciate any suggestions. Thanks!

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      Hi Chaitanya, Congrats on getting your own pseudospectral method working! For the multi-phase version, you still need to set up and solve a single non-linear program. Here is one way that you might construct the problem setup for a multi phase optimization: 1) build up the optimization problem for each phase independently; 2) concatenate each of these problems together into a single large problem formulation, being careful to get the index offsets correct for each phase; 3) add the linkage constraints, which are used to connect the phases -- these constraints will use decision variables from two or more phases, which is why you must solve a single NLP, rather than several decoupled NLPs. I suggest that you look at the user guide for GPOPS-II, which provides a nice interface for multi-phase problems. This gives a good starting point for what the interface for your user-defined functions might look like. I hope that this helps! Feel free to ask follow-up questions if you want. --Matt
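      A toy Python sketch of steps 2 and 3 above (scalar state per knot, everything here is illustrative):

```python
def split_phases(z, knots_per_phase):
    """Unpack one big decision vector into per-phase blocks,
    tracking the index offset of each phase."""
    phases, offset = [], 0
    for n in knots_per_phase:
        phases.append(z[offset:offset + n])
        offset += n
    return phases

def linkage_defects(phases):
    """Linkage constraints: the final state of each phase must equal
    the initial state of the next. All zeros when the phases connect."""
    return [phases[i][-1] - phases[i + 1][0] for i in range(len(phases) - 1)]
```

      Because a linkage defect uses decision variables from two phases, these constraints are exactly why the phases must be solved as one coupled NLP.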

    • @chaitanya.awasthi
      @chaitanya.awasthi 6 years ago +1

      Hi Matthew,
      Thank you for taking time out to reply to my comment. Before I had a chance to look at your comment, I tried doing just that, and it worked! Then, when I read your comment it confirmed my thought process and gave me more confidence in my approach. I did look at GPOPS manual and, like I said, I understood what needed to be done but wasn't quite sure how to implement it (or, maybe I didn't completely understand what needed to be done).
      The problem that I used to apply my multi-phase method was an "interior point constraint" problem and it worked beautifully. However, I think that this problem is a little easy compared to an "event-driven" multi-phase problem, which is what I am after (part of my research work). Unlike the interior point constraint, where I know the precise time at which phase change should occur (for example, Problem 3.17 in PSOPT manual), tackling an event-driven problem is relatively more challenging because the timing for phase change is implicit and depends on the occurrence of some event. So, to construct a toy example, suppose I have a dynamic model for a state variable V and I want the phase change to occur when V>= 2, does that mean if I use a 20 grid solver, the 20th grid point needs to satisfy the constraint of being >=2? If not, how should I approach this problem? I hope my question makes sense to you but I'll be happy to clarify anything about what I wrote. Thank you!

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      Hi Chaitanya,
      Good follow-up question. You can definitely have event-driven phase transitions. I study walking robots, and a common example is that the heel-strike during walking must occur precisely at a phase boundary. This can be done even if the boundary time is unknown. There are two independent things going on here:
      1) How to handle an unknown phase boundary time? The key is to make the duration of each phase a decision variable in the optimization. The time mesh within each phase is then shifted and scaled based on the duration of the phases. Computing an unknown duration is typically "hard", so expect this to slow down the optimization if you have many phases.
      2) How to handle a constraint at a phase boundary? You can treat this just like any other boundary constraint. In the walking example, we have an equality constraint that the collision must satisfy linear and angular momentum for the robot model.
      --> With an unknown duration, it is likely that you will want to solve the optimization in multiple iterations, re-meshing between iterations. This ensures that your mesh is always doing a good job of accurately representing your system dynamics.
      Good luck!
      Matt

    • @chaitanya.awasthi
      @chaitanya.awasthi 6 years ago

      Hi Matthew,
      You've really helped me in getting to the root of the problem, and I think it lies in handling unknown phase boundary time. I was always able to solve single phase final unknown time problems. It involved the same scaling technique that you alluded to. The extension to multi-phase unknown boundary time problem seemed rather straightforward from there. However, this is where I am not getting the correct answer.
      So, in order to understand the basics of multi-phase unknown phase time problem, I tried to reformulate the interior point constraint problem so that I could solve it like unknown phase time problem (it really is not unknown phase time, but I created two additional decision variables corresponding to initial and final times within each phase). I know in single phase you scale the dynamics by some unknown "time" parameter which is also your decision variable. I tried doing just that for the two separate phases and then linked them using linkage constraints. However, it turns out that when my dynamics are not scaled, I get the correct answer, but when I do scale my dynamics (as I absolutely should!), I get an incorrect answer. Do you have any thoughts on this? Also, in your latest comment you mentioned "The time mesh within each phase is then *shifted* and scaled based on the duration of the phases." What is the shifted part referring to? Again, thanks a lot for your helpful responses.

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      Hi Chaitanya, It sounds like you're putting the scaling in the wrong places. There are two places where you need the time scaling: 1) when you create the time-grid from the decision variables, and 2) when you apply the differentiation matrix in the pseudo-spectral method. This scaling is a simple multiplication and can be verified with a simple unit test.
      A linear map has two coefficients (A*x+B), and those can be thought of as shifting (B) and scaling (A).
      Good luck!
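
A toy Python sketch of the shift/scale idea, using a 2-point grid whose forward-difference matrix happens to differentiate linear trajectories exactly (everything here is illustrative, not any solver's actual code):

```python
import numpy as np

def map_time(tau, t0, T):
    """Affine map from normalized time tau in [0, 1] to physical time:
    t = A*tau + B, with scale A = T (the phase duration, a decision
    variable) and shift B = t0 (the phase start time)."""
    return t0 + T * tau

def collocation_defect(D, X, F, T):
    """Differentiation-matrix collocation defect when T is unknown.
    D differentiates with respect to tau, so d/dt = (1/T) * D, and the
    scaled dynamics constraint D @ X - T * F = 0 must hold at a solution."""
    return D @ X - T * F
```

The same factor of T shows up in both places mentioned above: once when building the physical time grid, and once when applying the differentiation matrix; each is a one-line multiplication that is easy to unit-test.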

  • @trunc8
    @trunc8 3 years ago +3

    I watched this twice to soak all the information. Thank you so much for the amazing lecture!
    What's the difference between knot points and collocation points?

    • @MatthewKelly2
      @MatthewKelly2  3 years ago +2

      Good question! Knot points are the points along the trajectory that separate continuous polynomial sections. Put differently, they are the places where there is a discontinuity in some derivative. This terminology is broader than trajectory optimization: it comes from polynomial splines. If the spline has N segments, then it will have N+1 knot points, including the boundaries of the spline. Collocation points are those points where the system dynamics are evaluated, and represent the points that are used to discretize the system dynamics. The "collocation error" will be zero at these points for a valid solution to the NLP. The collocation and knot points often overlap, especially for low-order methods. For example, in trapezoidal collocation, the knot and collocation points are identical. In backward Euler discretization, all collocation points overlap with knot points, but there is a single knot point (the first point on the trajectory) that is not a collocation point. For the Hermite-Simpson method, all knot points overlap with collocation points, but there is an extra collocation point in the middle of each segment that does not overlap with a knot point.
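
A small Python sketch that just enumerates the two point sets for the trapezoidal and Hermite-Simpson cases described above:

```python
import numpy as np

def trapezoid_collocation_points(t_knots):
    """Trapezoidal collocation: the collocation points are exactly
    the knot points."""
    return np.asarray(t_knots, dtype=float)

def hermite_simpson_collocation_points(t_knots):
    """Hermite-Simpson: collocation at every knot point plus an extra
    point at the midpoint of each segment."""
    t = np.asarray(t_knots, dtype=float)
    midpoints = 0.5 * (t[:-1] + t[1:])
    return np.sort(np.concatenate([t, midpoints]))
```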

    • @trunc8
      @trunc8 3 years ago

      @@MatthewKelly2 That clears my doubt. Thank you!

  • @chaitanya.awasthi
    @chaitanya.awasthi 7 years ago +1

    Hi Matthew,
    Thanks a bunch for sharing this presentation, it was super helpful. However, I do have a minor question. In slide 12, you mentioned that N = number of grid points, but, in slide 15 you said N = number of segments. But the number of segments is always going to be one less than number of grid points (ex. for 10 grid points, you'll have 9 segments). So, which one is it?

    • @MatthewKelly2
      @MatthewKelly2  7 years ago

      Great catch - that is a typo in my presentation. N on slide 15 should be the number of grid points, and (N-1) is the number of segments (at least for the trapezoidal collocation method). I will add a correction to Slide 15 as soon as I can. Thanks for pointing out the mistake, and I'm glad that it was still able to be helpful!

    • @chaitanya.awasthi
      @chaitanya.awasthi 7 years ago

      Thank you for clarifying that, Matthew. But, if N = number of grid points, then the trapezoid quadrature formula on slide 13 looks a bit off to me as k should go from 0 to N-2 (this equation would have been fine if N were the number of segments, but not now). Here is a reference for the quadrature formula wrt which I am posing my question:
      tutorial.math.lamar.edu/Classes/CalcII/ApproximatingDefIntegrals.aspx
      I'm sorry I am not trying to give you a hard time but I spent several hours trying to find a bug in my code and in the end it boiled down to this. Please correct me if I am wrong. Thank you!

    • @MatthewKelly2
      @MatthewKelly2  7 years ago

      I'm pretty sure that the equation on slide 13 is still correct. There are N grid points, and (N-1) segments. The summation is over segments, where each segment has an upper and lower grid point. If you look at the summation on the page you linked, you will see that they sum over knot points instead of segments, and that they have a different coefficient for the first and last term in the sum. If you were to expand the summation on slide 13 and then group like terms you will get the same expression. This is because the point w0 will get counted only in the first segment, but w1 will get counted in both the first and second segment. Likewise, all intermediate points will get counted twice and the final point only counted once. Let me know if this makes sense, or if you still think that there is an issue. Thanks!
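
A quick numerical check (Python) that the segment-wise sum described above and the textbook point-wise sum agree:

```python
import numpy as np

def trapezoid_sum_over_segments(h, w):
    """Slide-13 form: sum over the (N-1) segments, each contributing
    (h/2)*(w_k + w_{k+1}); interior points get counted twice."""
    w = np.asarray(w, dtype=float)
    return (h / 2.0) * np.sum(w[:-1] + w[1:])

def trapezoid_sum_over_points(h, w):
    """Textbook form: sum over the N grid points with half weight on
    the two boundary points."""
    w = np.asarray(w, dtype=float)
    return h * (0.5 * w[0] + np.sum(w[1:-1]) + 0.5 * w[-1])
```

Expanding the segment sum and grouping like terms gives exactly the point sum: the two boundary points appear once each, every interior point appears twice.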

  • @user-ot7yc1hv7t
    @user-ot7yc1hv7t 3 years ago +1

    Please, how do I add the constraint q1^2+q2^2+q3^2+q4^2 = 1 ("subject to") in your solver, where q1, q2, q3, q4 are states?

    • @dellpi3911
      @dellpi3911 3 years ago

      ua-cam.com/video/fRyUf-GY754/v-deo.html

  • @pythonking_stem1527
    @pythonking_stem1527 1 year ago

    3:58 @MatthewKelly2, Can you tell me where we use this closed loop optimal control?

    • @MatthewKelly2
      @MatthewKelly2  1 year ago +1

      The "closed loop optimal control" problem is traditionally set up by solving the "Hamilton Jacobi Bellman" HJB equation numerically. This is only practical for low-dimensional problems due to the "curse of dimensionality". More recently, there has been a ton of progress made in approximate solutions to this problem using reinforcement learning. There are lots of good resources on that, and I'm not a real expert on RL. The most impressive work that I've seen there is by the Robot System Lab at ETH Zurich, where they used it to control a quadruped robot walking over rough terrain. More details here: rsl.ethz.ch/research/researchtopics/rl-robotics.html

    • @pythonking_stem1527
      @pythonking_stem1527 1 year ago +1

      @@MatthewKelly2 Thanks for the link...I find your way of teaching very interesting ...

  • @sasuke90naruto
    @sasuke90naruto 6 years ago

    Thank you matthew.
    I am truly grateful to you, may God bless you. Maybe I will send you an email for some references

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      Domenico Laviola - I’m glad that you found the video helpful! There are some references at the end of the video and also in the long tutorial paper on my website:
      www.matthewpeterkelly.com/research/MatthewKelly_IntroTrajectoryOptimization_SIAM_Review_2017.pdf
      Feel free to email me if you’re looking for something more specific.

    • @YOUnoobGER
      @YOUnoobGER 3 years ago

      @@MatthewKelly2 Thanks for making the SIAM Review public!

  • @JeffryHartanto88
    @JeffryHartanto88 5 years ago

    How do you get the grid points? Thanks..

    • @MatthewKelly2
      @MatthewKelly2  5 years ago

      Typically you would start with a small number of uniformly spaced grid points, with the exact number determined by heuristics or trial and error. Once you've got that working, then you can dive into using mesh refinement, adding more points where there is larger collocation error.
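
A minimal Python sketch of one common refinement heuristic, bisecting any segment whose error estimate is too large (the per-segment error estimates themselves are assumed given; more sophisticated schemes also adjust polynomial order):

```python
import numpy as np

def refine_mesh(t_grid, segment_errors, tol):
    """Bisect every segment whose local error estimate exceeds tol.
    segment_errors[k] is the estimate for segment [t_k, t_{k+1}]."""
    t_grid = np.asarray(t_grid, dtype=float)
    new_grid = [t_grid[0]]
    for k, err in enumerate(segment_errors):
        if err > tol:
            # insert the midpoint of the offending segment
            new_grid.append(0.5 * (t_grid[k] + t_grid[k + 1]))
        new_grid.append(t_grid[k + 1])
    return np.array(new_grid)
```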

  • @juliobetancourtvera4644
    @juliobetancourtvera4644 6 years ago

    Hi Matthew,
    This is great work for the whole community, both for people who want to get started and for those who already have the basics of optimization problems (optimal control). In my case, it helped me start learning a lot about optimal control, and I'm very interested in working with this. Currently I work with MPC (Model Predictive Control), which I implemented on a quadcopter (ArDrone Parrot) but without constraints. The essence of MPC is to deal with constraints, but as you know, dealing with constraints can be computationally hard, especially in an embedded system.
    If you can give any advice on dealing with this computational burden, I would appreciate it. And again, thanks for your great work.

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      Restating your question: "how do you get trajectory optimization to run fast enough for real-time applications?"
      There is no general answer that I know of, but there are several guidelines:
      1) Make sure that you have an excellent initial guess (eg. seed current trajectory with previous in MPC).
      2) Avoid nonlinear functions. If your initial guess is very good then you can linearize your dynamics and constraints to turn the nonlinear program into a quadratic program (assuming that you have a quadratic objective). There are very fast sparse quadratic programming solvers. You need to do some extra work to verify that the solution is still valid given the linearization.
      3) If you only deal with kinematics (rather than dynamics), then you can typically write the problem as a sparse quadratic program (with inequality constraints) directly, without worrying about linearization.

  • @vivekyadav-zs8wg
    @vivekyadav-zs8wg 7 years ago +1

    I am teaching Direct collocation as part of my course, can I use some material from here for my class?

    • @MatthewKelly2
      @MatthewKelly2  7 years ago +1

      vivek yadav - thanks for your interest! You may use my material for your class. Please be sure to include a citation to whatever you do end up using. Thanks for checking, and good luck with your class!

    • @vivekyadav-zs8wg
      @vivekyadav-zs8wg 7 years ago

      Thanks, I included links to your work, and this talk, in the notes. It's in the additional resources section at the bottom. Please let me know if there are any issues.
      mec560sbu.github.io/2016/09/30/direct_collocation/

    • @MatthewKelly2
      @MatthewKelly2  7 years ago

      Hi Vivek - your course page looks awesome! Nice job. I skimmed through it quickly, but I hope to read it more carefully later.

  • @sidkshatriya
    @sidkshatriya 6 years ago +1

    I loved this video. Thank you! One major question I had was: how is it possible for the error to be zero at knot points in your error graphs, when in slide 15 you are using approximate dynamics, i.e. x_k+1 - x_k = h/2 …

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      Great question: the key is in understanding what is meant by error. There are several different definitions of error. In this case I am referring to the error in the differential equations at a specific time: e(t) = xDot(t) - f(t, x(t), u(t)). If the NLP were to precisely satisfy its constraint equations, then this particular definition of error would be precisely zero at each collocation point. It would be non-zero elsewhere due to the approximate solution of the differential equations via quadrature. The total error in the solution to the differential equations between two knot points is obtained by computing the integral of the absolute value of the error in the differential equations over that same interval. This can then be used to compute a new set of mesh points. Let me know if this makes sense.

    • @sidkshatriya
      @sidkshatriya 6 years ago

      Thanks Mathew for your reply. I'm still a bit confused. Error to me is a difference between the true state of the system and the computer calculated (or in this case NLP solved) state of the system. Now because your constraint equation x_k+1 - x_k = (f_k+1 + f_k) h / 2 is an approximate dynamics, x_k+1 and x_k given by the NLP solver will always either undershoot or overshoot the real life system value (except of course at the boundary where there will be no undershoot or overshoot because they are user specified). Now you could always simulate the system forward from x_k onwards towards x_k+1 by using a finer mesh and calculate the deviation from the (implied linear) interpolation. But the fact remains that even x_k and x_k+1 found by solver are both approximations (am I correct here?). So you're getting an idea of an error in a quantity that already has an error? I hope I'm making sense.

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      You're correct: the error would be the difference between the true solution and the current solution. The trouble is that we do not know the true solution. Instead, we must approximate the true solution. One way to do this is to observe that on the true solution the dynamics will be precisely satisfied at every instant, not just at the knot points. From this, you can construct a local error estimate between each pair of knot points: assuming that the lower knot point is correct, how much error do we think was accumulated by errors in the system dynamics over this segment? That quantity is estimated by numerically integrating the error in the differential equations between those two knot points. I suspect that you could approximate a global metric by accumulating the local errors, but I'm not sure on this. The method that I've outlined here is the method suggested in Betts's book, although he references some other ones as well. I hope that this makes it a bit more clear!
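
A minimal Python sketch of that local error estimate, integrating the absolute error in the differential equations on a fine grid between two knot points (uniform-grid trapezoid quadrature, chosen here just for illustration):

```python
import numpy as np

def local_error_estimate(t_lo, t_hi, x_of_t, xdot_of_t, f, n_quad=100):
    """Estimate the error accumulated over one segment by numerically
    integrating |x'(t) - f(t, x(t))| between the two knot points.
    x_of_t and xdot_of_t evaluate the candidate trajectory and its
    derivative; f is the system dynamics."""
    t = np.linspace(t_lo, t_hi, n_quad)
    residual = np.abs(xdot_of_t(t) - f(t, x_of_t(t)))
    dt = t[1] - t[0]
    # trapezoid rule on the uniform quadrature grid
    return dt * (residual.sum() - 0.5 * (residual[0] + residual[-1]))
```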

    • @sidkshatriya
      @sidkshatriya 6 years ago

      > From this, you can construct a local error estimate between each pair of knot points: assuming that the lower knot point is correct, how much error do we think was accumulated by errors in the system dynamics over this segment?
      Thank you. Things are clearer now with your last clarification. What threw me off was you showing on slide 15 that error was zero at the knot points. You've clarified that the error calculation is based on an approximate estimation. You mentioned that you're assuming the first knot point to have zero error from the true system value and then simulate from that point and measure error accumulation as we go along towards the next knot point. This gives us an estimate of the parts of the system that are subjected to high rates of system change and there we can re-mesh as I understand it. But that again begs the question, why does your error decline to zero again as we approach the next knot point. Should it not be like a sawtooth -- error rising as we go from left to right towards the next knot point and then restarting from zero again? Unless we have error being calculated forwards and backwards from the knot point ahead and behind.
      Just for your information, I came to your video about trajectory optimization after reading about some impressive stuff on Contact-Invariant Optimization by Igor Mordatch et al. in which animal like characters motions are automagically found by the solvers. Wow. And then your video helped everything fall into place conceptually. Thanks for introducing me to this powerful way of doing things.

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      I'm glad to have made things a bit more clear. It's good to know that you managed to find my trajectory optimization paper! It is published by SIAM, and is available on their website: epubs.siam.org/doi/pdf/10.1137/16M1062569
      In the video I talk about two types of error: we use the error in the differential equations to estimate the local error within each segment. The error in the differential equations is not a saw-tooth because there is no notion of time moving forward or backward. The local error estimate is a single number for each segment, so there is no pretty plot to show for it, although I guess that a bar chart would work.
      The paper by Igor Mordatch is great. You might also like the two papers by Michael Posa and Russ Tedrake about through-contact optimization. Best wishes!

  • @wallee39
    @wallee39 6 years ago +1

    Hi Matthew,
    This is a wonderful tutorial. Thank you for putting things together. Can you tell me how to handle trajectory optimization for discontinuous dynamics?

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      Hi Walelign,
      I'm glad that you found the tutorial helpful! I've outlined a few methods below. The best method for you will ultimately depend on the system dynamics.
      1) Multi-phase methods. Model the system as a finite-state machine, then pose a trajectory optimization problem for each continuous phase. Then add a boundary constraint representing the transition between each state. Works best when the transitions between continuous phases of motion are happening in a known order and there are few of them. Look up GPOPS-II and Anil Rao for some more technical details on this type of method.
      2) Through-contact methods. This is primarily used for systems where the discontinuities are due to contacts, such as a foot hitting or leaving the ground. You treat the complementarity constraint for the contact model as path constraints in the trajectory optimization. This works well when the sequence of contacts is not known a priori. Look up {Michael Posa and Russ Tedrake @ MIT} and {Igor Mordatch and Emo Todorov @ U. Washington} for more details.
      3) If the discontinuity is some basic function, such as abs(), max(), min(),... then it is sometimes possible to use a slack variable and additional constraints to push all of the discontinuities into constraints. This is commonly used in linear programming. I learned about it from "Practical Methods for Optimal Control and Estimation Using Nonlinear Optimization" by John T. Betts.
      4) If the discontinuity is some other small function that is not easy to handle with slack variables, then it is often possible to make a smooth approximation of it. For example, replacing the sign() function with a sigmoid function. Look up "exponential smoothing" for common examples.
      I briefly discuss all of these methods and detailed references in my two tutorial papers on trajectory optimization:
      www.matthewpeterkelly.com/research/MatthewKelly_IntroTrajectoryOptimization_SIAM_Review_2017.pdf
      arxiv.org/pdf/1707.00284.pdf
      Matt
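
A tiny Python sketch of method 3, writing |v| with two non-negative slack variables; the closed-form optimal slacks are used here just to show that the reformulation recovers the absolute value (in a real NLP the slacks would be extra decision variables with bounds and an equality constraint):

```python
def abs_via_slacks(v):
    """Slack-variable representation of |v|: write v = s_plus - s_minus
    with s_plus, s_minus >= 0 and minimize s_plus + s_minus.  At the
    optimum all of v sits in one slack, so the minimal cost equals |v|,
    and the NLP only ever sees smooth (linear) functions."""
    s_plus = max(v, 0.0)     # optimal slack values, written in closed form
    s_minus = max(-v, 0.0)
    assert abs((s_plus - s_minus) - v) < 1e-12  # equality constraint holds
    return s_plus + s_minus
```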

    • @wallee39
      @wallee39 6 years ago

      Matthew, Thank you so much! This helps a lot. Really appreciate it!

    • @MatthewKelly2
      @MatthewKelly2  6 years ago

      You're welcome - I'm glad to help!

  • @pythonking_stem1527
    @pythonking_stem1527 1 year ago

    43:00 is it dirk hall?

  • @yuelinzhu9800
    @yuelinzhu9800 2 years ago

    Thank you, it's still very helpful right now. I also have a question: why do we use the integral of the torque (or input) as the objective function?
    What is the basis for doing that? Is it something like (let F = ma, with F = 1 and m = 1; the work is W = FS = a * (1/2)a*t^2)? Thank you Kelly, if you have time to answer my question.

    • @MatthewKelly2
      @MatthewKelly2  2 years ago +1

      There are some more-technical ways to show that the "integral of control effort squared" is a good objective function for second-order dynamical systems, but here is an intuitive explanation. The quadratic cost means that it is "expensive" to use a large control effort. Let's imagine that the controller wants to deliver a fixed impulse. If the cost was linear, then there would be no reason to prefer a 0.1s * 10N impulse vs a 1s * 1N impulse. With a quadratic cost the 1s * 1N impulse is "less expensive". Overall, this means that the "control-squared" objective tends to produce smooth solutions. This is important because the discrete approximation of the trajectory relies on the assumption that a polynomial spline can approximate the solution... which is only true if the solution is "smooth". It also happens that quadratic cost functions have really nice gradients, which helps the optimization numerically.
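
A quick numeric illustration (Python) of the impulse example, comparing two constant-force pulses that deliver the same impulse under linear and quadratic effort costs:

```python
def effort_cost(force, duration, power):
    """Integral of |u|^power over a constant-force pulse of the
    given duration."""
    return (abs(force) ** power) * duration

# Two pulses that deliver the same impulse of 1 N*s:
short_hard = (10.0, 0.1)   # 10 N for 0.1 s
long_soft = (1.0, 1.0)     # 1 N for 1.0 s

# Linear cost: the two pulses tie, so there is no preference.
linear_costs = (effort_cost(*short_hard, 1), effort_cost(*long_soft, 1))
# Quadratic cost: the short, hard pulse is much more expensive,
# so the optimizer prefers the gentle pulse and smoother solutions.
quadratic_costs = (effort_cost(*short_hard, 2), effort_cost(*long_soft, 2))
```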

    • @yuelinzhu9800
      @yuelinzhu9800 2 years ago

      @@MatthewKelly2 Hi, Kelly, Thank you for your clear explanation, it's very helpful for me.

    • @yuelinzhu9800
      @yuelinzhu9800 2 years ago

      @@MatthewKelly2 Hi, Kelly, I also have another question: can we understand the closed-loop solution as a global solution?
      If yes/no, can you give me simple reference material? Thank you very much.

    • @MatthewKelly2
      @MatthewKelly2  2 years ago

      @@yuelinzhu9800 - The type of trajectory optimization explained in this video can be thought of as a local solution, which could be evaluated to produce an open-loop controller. If you want the full global solution then you want a different technique. Look up "solving the Hamilton Jacobi Bellman (HJB) equations". You'll find that they become intractable for most non-trivial problems. The general approach is to use reinforcement learning to approximate the global solution. This is an active area of research, and I'm not an expert -- lots of neat stuff to learn there. Another related thing is model-predictive control. That produces a local closed-loop controller (rather than open loop). The idea is that you continually solve a trajectory optimization (such as the cart-pole swing-up) from the measured state, which then allows you to feed back and stabilize errors. This is difficult to implement in practice because you need to solve the optimization very quickly (in real time, in your control loop). That being said, it can be done, once you get all of the details right.

  • @S255fjrbr
    @S255fjrbr 3 years ago

    Does the method for the numeric integral and the solution of the ODE always have to be the same? I.e., can I approximate the integral using the trapezoidal method and solve the ODE using a forward Euler scheme? If not, why not?

    • @MatthewKelly2
      @MatthewKelly2  3 years ago

      When solving a trajectory optimization problem the ODE is a constraint. The accuracy to which the ODE is solved by the trajectory optimization is a function of (1) your discretization method and (2) your convergence tolerances in the nonlinear programming solver. Suppose that you have an exact (eg. analytic) solution to a trajectory optimization problem. Then if you start at the initial conditions and use an ODE solver (giving it the exact control solution as a function of time), then the ODE solver will reconstruct the state trajectory (within the numerical accuracy of the ODE solver). Put another way: the trajectory optimization contains a full ODE solver. It just happens that trajectory optimization is typically done with an implicit integration scheme (eg. trapezoid), whereas most "simulators" would use an explicit scheme (eg. Euler's method).
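
A minimal Python sketch of the implicit trapezoid constraints being described, written as defect residuals that the NLP drives to zero (the array layout is an assumption for illustration):

```python
import numpy as np

def trapezoid_defects(t, X, U, f):
    """Implicit trapezoid rule written as NLP equality constraints:
    x_{k+1} - x_k - (h_k/2)*(f_k + f_{k+1}) = 0 for every segment.
    X is (N, n_state), U is (N, n_control), t is (N,)."""
    h = np.diff(t)
    F = np.array([f(tk, xk, uk) for tk, xk, uk in zip(t, X, U)])
    return X[1:] - X[:-1] - 0.5 * h[:, None] * (F[1:] + F[:-1])
```

Note that each defect couples x_k and x_{k+1}, which is what makes the scheme implicit; the NLP solves for all of them simultaneously rather than stepping forward in time.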

    • @S255fjrbr
      @S255fjrbr 3 years ago

      @@MatthewKelly2 First of all, thank you very much for the quick reply! I am aware that in this numerical optimization the ODE contains a full solver by enforcing the discretization as equality constraints at each time step. I also understand that choosing an implicit scheme does not come at the "usual" cost of solving a system of nonlinear equations before calculating the next state because in the optimization context, the information about the "next" states/inputs is inherently part of the optimization as a decision variable, correct? I was wondering more about how the ODE discretization is connected to the quadrature for the running costs l(x,u). If I choose to integrate the ODE with a trapezoidal scheme, do I have to use that SAME scheme for the quadrature calculation of the cost function? If that is the case, why is that? Can't I use, say, Simpson's rule to integrate the running costs while using the trapezoidal scheme to integrate the ODE. Or, for example, can I use the trapezoidal scheme to evaluate the cost function and integrate the ODE with a forward Euler scheme? I hope I could explain myself more clearly.

    • @MatthewKelly2
      @MatthewKelly2  3 years ago

      @@S255fjrbr I think that I understand now. Good question. The "correct" way to do it is to always choose a consistent integration scheme for the system dynamics and the objective function. That being said, you could certainly set up different integration schemes and still get reasonable answers (I think...). I don't know the precise details for why this is the case, although I suspect that it has to do with making reasonable predictions about the accuracy and convergence of the discretization scheme itself (eg. do you have enough knot points in the right places).

    • @S255fjrbr
      @S255fjrbr 3 years ago

      @@MatthewKelly2 Thanks again. I also can't come up with an answer. Maybe it becomes more clear if a problem with final and running costs is considered. The "accuracy" of the final costs computation would then be linked to the accuracy of the underlying ODE integration scheme while evaluation of the integral of the running costs would result in a different accuracy, e.g. if you integrate the ODE with forward Euler and the integral using the trapezoidal approach, you would add two values of different accuracies, which does not seem to be a good idea. I have another question, though. If I were to choose the forward Euler scheme to approximate both ODE and integral costs, I would end up with a discretized cost function sum_{k=0}^{N-1} l(x_k,u_k). But as can be seen, this results in a cost that is independent from the final state/input (x_N,u_N). This is not really intuitive, I think, and it is also independent from the input interpolation because the forward Euler does not "care" about what happens in between the discretization steps. Am I missing something?

    • @MatthewKelly2
      @MatthewKelly2  3 years ago

      @@S255fjrbr Good observations. You're right: the forward Euler discretization would not include the terminal state directly in the running cost. You'll see similar artifacts show up in other methods as well. In theory, these are accounted for by the fact that your dynamics will prevent any large jump in the state at the boundaries. I suspect that this is another place where careful study would lead to some interesting nuances and understanding of the details of trajectory optimization.
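
A small Python comparison of the two quadrature schemes discussed here; only the trapezoid version gives the terminal point any weight:

```python
import numpy as np

def running_cost_euler(h, l_vals):
    """Forward-Euler quadrature of the running cost: the final
    point l_N never appears in the sum."""
    l = np.asarray(l_vals, dtype=float)
    return h * np.sum(l[:-1])

def running_cost_trapezoid(h, l_vals):
    """Trapezoid quadrature: every point contributes, with half weight
    at the two boundaries, so the terminal state enters the cost."""
    l = np.asarray(l_vals, dtype=float)
    return h * (0.5 * l[0] + np.sum(l[1:-1]) + 0.5 * l[-1])
```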

  • @pedrocalorio1655
    @pedrocalorio1655 3 years ago

    How do you handle trajectory optimization, or any gradient-based optimization, when the input law is discontinuous via an if statement?

    • @MatthewKelly2
      @MatthewKelly2  3 years ago +1

      It depends what you mean by "input law is discontinuous". Let's be more specific. If the objective or constraint function has a discontinuity, then you've got trouble. This will cause a jump in the gradients that will break most continuous optimization solvers. The general "trick" is to reformulate your problem in such a way as to inform the solver about the discontinuity and let it manage it via switching the set of active constraints. This is often done using slack variables. A common example is to replace an absolute value function with two positive slack variables and an equality constraint, as is described in Betts's book (see references at end of presentation or in my journal paper linked in the description).
      If the discontinuity occurs in the solution of the trajectory optimization, which can happen for some smooth optimization problems, then you can also have problems. The NLP will often converge, but your discretization will often be bad. The solution here is to iteratively adjust your mesh; this can be done manually, or through adaptive meshing. See the research by Anil Rao et al. on hp-adaptive meshing that is used in GPOPS for a concrete example and thorough analysis.

    • @pedrocalorio1655
      @pedrocalorio1655 3 years ago

      @@MatthewKelly2 thank you so much for your complete answer!!
      However, I don’t know if I’m mixing things here, but slack variables are part of the KKT conditions for optimality, right?
      And what if I cannot reformulate my NLP to make it continuous over my feasible set? Does this mean that I’ll have to go for stochastic methods of optimization?
      Sorry for coming with more questions....

    • @MatthewKelly2
      @MatthewKelly2  3 years ago

      @@pedrocalorio1655 Some NLP solvers use slack variables internally, but you don't need to worry about that. The slack variables that I'm talking about would be added directly as decision variables while formulating the NLP, typically as extra state or control variables. The `minimumWork/MAIN_cstWork.m` example in my OptimTraj software shows a simple example of using slack variables to represent an absolute value function. Similar tricks can be used for min() and max(). There are also various ways to "smooth out" functions and interpolation from tables. Iterative methods can be made "smooth" by using a fixed number of iterations.
      You can use stochastic optimization, but I've never gotten that to work reliably. The problem is that trajectory optimization requires a large number of equality constraints for the defects, and stochastic optimizers have a really hard time with equality constraints. Put differently, the whole idea behind collocation methods (like I describe here) is to put the optimization into a format that makes it easy for sparse gradient-based solvers to optimize. If you switch to a stochastic solver, then you would probably want a different formulation entirely, something closer to single shooting (which has its own set of challenges).

  • @LuisFRossi
    @LuisFRossi 7 років тому

    Hello again, I am studying multi-phase trajectory optimization for biped gait and I started with the Compass-Gait. I was able to create a trajectory optimization for a single step with different conditions and constraints. Now I am looking for a way of optimizing 2 steps in a row, each step with different constraints (and maybe objective functions, if possible). Is this possible? Probably multi-phase trajectory optimization would be the best solution for my case, but I am still looking for good examples. If you have any recommendations, that would be great. After some work, basically the complexity of this problem is that the final state of the first step must satisfy the constraint of enabling a second step that can satisfy its constraints.

    • @MatthewKelly2
      @MatthewKelly2  7 років тому

      You can definitely create and then solve a multi-phase trajectory optimization problem for a sequence of walking steps. The most that I've ever done was 5 steps (phases) with a simple walking model (point mass, extensible legs).
      To connect two steps, you simply place a boundary constraint that connects the two phases of motion. This usually involves something to make sure that the minimal-coordinate representation of the state matches up between steps.
      Changing the objective function between steps is fine, as long as it is constant throughout each individual phase.
      As you add more phases, the problem will become more difficult to solve, and you will be more likely to get stuck on bad local minima. You might need to be clever about constructing feasible initial guesses for each step.
      I have a two-phase optimal gait demo for a simple model of walking on GitHub:
      github.com/MatthewPeterKelly/TwoPhaseOptimalGait
      The code is written in Matlab, and you will need GPOPS-II to run it. Even if you don't have GPOPS-II, you might find the code useful to see how I set up the constraints to link different phases of motion.

    • @LuisFRossi
      @LuisFRossi 7 років тому

      Thank you,
      Reading your example, if I understood it correctly, I think there are a few pieces of information missing to implement my problem. In my case I have two different phases with the same dynamics and an impulsive change of states between phases (the Compass-Gait impact with the floor). I am wondering how I can implement that impulsive change, as it will require an if statement and be strongly non-linear? Any advice on that?

    • @MatthewKelly2
      @MatthewKelly2  7 років тому

      The key idea is that all "if statements" that are in your code must be implemented with a constraint. In this case, it is relatively simple. Put a constraint on the final state so that it is in a valid heel-strike configuration (foot is on the ground). Then put a constraint that the initial state is equal to the final state, after applying the heel-strike map. I believe that the simpleWalker demo in my OptimTraj code (below) does exactly what you're asking. If not, then perhaps I misunderstand your question.
      github.com/MatthewPeterKelly/OptimTraj/tree/master/demo/simpleWalker

    • @LuisFRossi
      @LuisFRossi 7 років тому

      Thank you for your fast reply,
      It is a bit different. That specific solution will give me a limit cycle, but that is not what I am looking for. I want to generate 2 steps with different constraints or objective functions. Let's say I want the Compass-Gait to have in the first step an average velocity of 0.5 m/s with any step size, and in the second step a step size of 0.5 m with any velocity. I would like to generate the trajectory of both steps, but they must be solved simultaneously, as the final state of the first step must lead to a condition that enables the second step to be feasible. If I generate them separately, the first step might not end in a condition that is feasible for the second step. Is this possible?

    • @MatthewKelly2
      @MatthewKelly2  7 років тому

      I understand now. The concept is very similar to the limit cycle.
      Any time that there is a collision, you must have a boundary between two continuous phases of motion:
      -- Phase One: first walking step
      -- Phase Two: second walking step
      Each phase has two boundary states: one for the beginning and one for the end. You will need two constraints between the two phases:
      -- ensure that the system is in a valid configuration (ie. foot on the ground)
      -- ensure that the discrete heel-strike map connects the final state in phase one to the initial state in phase two.
      You can then apply whatever constraints and objective you want to each phase. For example, to achieve a fixed step length in phase one, you simply add an additional constraint between the initial and final state in phase one. To achieve a fixed average velocity in step two, you simply add a constraint between the initial and final state in phase two.
      This cannot be done easily with my OptimTraj package, since it only supports single phase problems, but it is not too difficult to code up in other packages, or just using your own collocation method.
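      A rough sketch of the linking constraint, with a made-up heel-strike map and a two-element state (all names and numbers here are illustrative, not from a specific walking model):

```python
# Hypothetical linking constraint between two walking steps. The state is
# [position, velocity] and heel_strike_map is a stand-in for the real
# impulsive impact map.

def heel_strike_map(x_final):
    pos, vel = x_final
    return [pos, 0.8 * vel]  # e.g. some velocity is lost at impact

def linkage_defects(x1_final, x2_initial):
    """Equality constraints that the NLP solver must drive to zero."""
    mapped = heel_strike_map(x1_final)
    return [x2_initial[0] - mapped[0],
            x2_initial[1] - mapped[1]]

# A consistent pair of boundary states has zero defect:
x1_final = [0.5, 1.0]
x2_initial = heel_strike_map(x1_final)
assert linkage_defects(x1_final, x2_initial) == [0.0, 0.0]
```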

  • @jingjiejiang9246
    @jingjiejiang9246 6 років тому

    Hi, Matthew,
    thanks for your nice and helpful video. I just got the idea to solve my optimization problem for trajectory planning of a fleet of AGVs. However, in my framework, those AGVs can only move in one direction along a grid-shaped path. This constraint seems to be an if-else problem; could you please point me to some references? Or do you have any other idea for transforming this discontinuous constraint into a continuous one?
    I appreciate your help!

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      The type of trajectory optimization in this tutorial is specifically focused on working with continuous dynamical systems. There are much better optimization techniques when you have a system that is partially or fully discrete. I know less about those techniques, but I'll try to give some helpful suggestions.
      How complicated are the dynamics of your system along the grid-paths? For example, can you directly control speed, or do you need to do control in the world of accelerations or forces?
      Let's assume that the kinematics of your AGVs are trivial, and the true challenge is in determining which path to take on the grid. Then you can fully discretize the problem, transforming it into the following: given my current location at intersection A, which road should I take? This becomes an optimal graph-search problem. You can use A* or Dijkstra's method to solve it. If things are a bit more complicated, then you can model the problem as a Markov Decision Process and then use value iteration to compute the optimal policy at every single point in your grid-world. This has the added benefit that you can have a single policy that all of the AGVs can follow. You might have to be a bit careful about collisions though.
      I hope that this is helpful!
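      For the fully discretized case, a minimal Dijkstra sketch in pure Python (the graph and costs are toy values for illustration):

```python
import heapq

# Toy Dijkstra for the discretized routing problem. Directed edges model
# AGVs that may only travel one way along each grid segment.

def dijkstra(graph, start, goal):
    """Return the cost of the cheapest path from start to goal."""
    dist = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == goal:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, []):
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return float("inf")

grid = {"A": [("B", 1.0), ("C", 4.0)], "B": [("C", 2.0)]}
assert dijkstra(grid, "A", "C") == 3.0  # A -> B -> C beats the direct edge
```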

    • @jingjiejiang9246
      @jingjiejiang9246 6 років тому

      +Matthew Kelly Actually, the original dynamics are continuous: a simple acceleration-controlled problem. And if you have some suggestions on analytical methods considering different constraints, that would be optimal. I also have a discretized model which I could use if there are no analytical solutions. And I already have the graph-based method. Anyway, it would be my best goal to solve it analytically. By the way, do you know how to turn IF-ELSE conditions into continuous constraints?

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      There are ways to handle if-else constraints in trajectory optimization, but they do not scale well to problems like the one you are describing (at least as far as I know). For example, you can use multi-phase optimization with parallel phases. Within a single phase you can handle if-else conditions by pushing them to the constraint solver --- this is how through-contact optimization works (look up the paper by Michael Posa and Russ Tedrake).
      It might be possible to use a hybrid approach: use a continuous-time trajectory optimization to compute the optimal solution on each edge of the discrete graph. For example, between each point where you stop. If you want to travel straight through an intersection, then make a new edge that includes the two roads on either side of the intersection. My guess is that this is the best approach, but there might be something else out there.

    • @jingjiejiang9246
      @jingjiejiang9246 6 років тому

      Matthew, sincere thanks for your great help! I slightly doubt the idea of the hybrid approach, since the first stop for each AGV is different, i.e. a different number of time steps to reach it; how could the whole thing be put into one optimisation problem? And could you please give me some references for time-optimal control fitting my problem, for discrete and continuous systems respectively? Or do you think it is difficult to tackle because of the IF-ELSE constraint?

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      Your proposed problem is definitely a tricky one to solve, but certainly interesting. It is generally a "hard" problem to do path planning and minimum-time optimization at the same time, especially when multiple vehicles are concerned. As best I know, the general approach is to decouple the problem, solving the path planning first and then computing the time-optimal solution as a sub-problem. That being said, I do not know that literature very well, so someone else may have done it. The if-else constraint is manageable on its own, but in the context of a larger path planning and trajectory optimization problem it will be very tricky. I would suggest doing a literature survey to first understand the problem as if the kinematics were simple (eg. distance is approximately time). Then figure out how to modify it to handle the more complicated dynamics. One paper that you may find interesting is "Minimum-time speed optimization over a fixed path" by Thomas Lipp and Stephen Boyd. Good luck!

  • @vitalram1985
    @vitalram1985 6 років тому

    Dear Matthew, it is such a nice video; it is the only video of its kind. Can you share any information on trajectory planning of a 2R manipulator, or any other manipulator, using direct collocation? I found a few papers, but they are very abstract for application.
    Thanks in advance

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      Thanks! Take a look at my trajectory optimization software for Matlab: OptimTraj, which is linked from the video. One of the examples is for a double pendulum, and another is for a simple walking model. Both use direct collocation for an R2 manipulator. Take a look at my tutorial paper, also linked, if you want more details about the mathematics of the transcription method.

    • @vitalram1985
      @vitalram1985 6 років тому

      Thanks again Matthew, for your reply. I will look into the suggested material and ask you if I need any further clarification.

  • @LuisFRossi
    @LuisFRossi 8 років тому

    Hello Matthew. If I am looking for a minimum-time trajectory, do you have any references?

    • @MatthewKelly2
      @MatthewKelly2  8 років тому +1

      All of the material in the video applies for minimum-time trajectory optimization. You just need to set the integrand of the objective function to 1 and make T a decision variable. The list of references at the end of the video and in the pages linked in the description also apply.
      You might find some useful references by looking up "the brachistochrone problem" - it is the canonical minimum-time problem.
      There is a related, but much harder problem: find the minimum-time trajectory that passes through specific way-points. I know much less about this field. This is similar to the problem in the video, except that the duration of every segment must be a decision variable (or it must be a multi-phase optimization with many phases). This makes the non-linear program poorly behaved as posed here. There are a variety of methods for solving it. Look up "time-optimal path parameterization".
      One paper that I've run across is: arxiv.org/abs/1312.6533 "A General, Fast, and Robust Implementation of the Time-Optimal Path Parameterization Algorithm". Another paper is: "Time optimal path planning considering acceleration limits" by Marko Lepetic et al.
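      A quick numerical check of the minimum-time objective: trapezoid quadrature of the constant integrand 1 over any grid on [0, T] returns exactly T, which is why minimizing this integral minimizes the trajectory duration (pure Python sketch):

```python
# Trapezoid quadrature of the constant function 1 telescopes to the width
# of the integration interval, for any (even non-uniform) time grid.

def trapezoid_quadrature(t_grid, values):
    total = 0.0
    for k in range(len(t_grid) - 1):
        h = t_grid[k + 1] - t_grid[k]
        total += 0.5 * h * (values[k] + values[k + 1])
    return total

t_grid = [0.0, 0.3, 1.1, 2.0]   # arbitrary, non-uniform grid on [0, T=2]
ones = [1.0] * len(t_grid)
assert abs(trapezoid_quadrature(t_grid, ones) - 2.0) < 1e-12
```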

    • @LuisFRossi
      @LuisFRossi 8 років тому

      Thank you for your answer. Why would changing the objective function to 1 work? Shouldn't I change it to t? I know there is an integration and the number of points would sum up, but the number of points used in the discretization might change.

    • @LuisFRossi
      @LuisFRossi 8 років тому

      I tried to change your Matlab example of the cartPole but it is not working. I changed the objective function to (ones(size(t))*max(t)) (basically a vector the size of t with all elements equal to max(t), so the trapezoidal interpolation works properly). I changed problem.bounds.finalTime.low to 0. However, the system converges to a completely insane solution using almost no force in almost no time. The solution does not seem to be correct. Any help on that?

    • @MatthewKelly2
      @MatthewKelly2  8 років тому

      The "path objective" function contains only the integrand of the objective function. The actual quadrature (integration) is handled by the transcription method. The integral of the function "1" is always equal to the difference between the limits of integration; in LaTeX: $ \int_0^T 1 \, dt = t \big|_0^T = T - 0 = T $
      If you were to use t as the integrand instead of 1, it would mean a different thing: you would minimize time with a time-varying weight on that time. It would also work, but be very different from minimum time.

    • @MatthewKelly2
      @MatthewKelly2  8 років тому

      Your integrand is wrong: it must be ones(size(t)) and not (ones(size(t))*max(t)).
      I added a minimum-time example to OptimTraj:
      github.com/MatthewPeterKelly/OptimTraj/blob/master/demo/cartPole/MAIN_minTime.m
      As a rule, you should never use the max() function inside of an integrand. The function is vectorized for speed, but every element should be evaluated using the same exact sequence of arithmetic operations and be independent of the others.
      The minimum-time objective function will result in much less well-behaved solutions than minimum-force, since minimum-time trajectories are always bang-bang solutions, and thus discontinuous. The discontinuity in the solution causes problems for convergence with direct transcription methods, although it will eventually converge if you're careful with the mesh refinement and problem set-up.

  • @pythonking_stem1527
    @pythonking_stem1527 Рік тому +1

    14:17 Can you give some reference on indirect methods?

    • @MatthewKelly2
      @MatthewKelly2  Рік тому +1

      Good question! I actually don't know any off the top of my head. I would start by looking into the references of the references at the end of this video. Let me know (or reply here) if you find a good one!

    • @pythonking_stem1527
      @pythonking_stem1527 Рік тому

      @@MatthewKelly2 sure thank you for the response

  • @leekarlson5348
    @leekarlson5348 8 років тому

    What do you think about PyGMO?

    • @MatthewKelly2
      @MatthewKelly2  8 років тому

      It looks neat! This is my first time hearing about it.
      It seems like PyGMO is a library of (mostly) parameter optimization algorithms. Most of my talk here is about how to convert a trajectory optimization problem into a parameter optimization problem. That being said, my guess is that someone will end up implementing a proper trajectory optimization method in PyGMO at some point.
      Libraries like PyGMO are great, because they allow for a standard interface for solving a single problem with many algorithms. This was a key concept in most of my learning about optimization. For example, when writing genetic algorithms, we always had to demonstrate that our GA outperformed both a random search and simulated annealing. Another interesting example is to compare PSO to CMAES for different problems.
      Let me know if you end up playing around with PyGMO for trajectory optimization!

    • @leekarlson5348
      @leekarlson5348 8 років тому

      Well, I am currently using an N-body simulator as a simulation to do optimisation, with the extra body acting as a rocket, and using PyGMO to do the optimisation. After watching your video, now I know this is not the way forwards... But fortunately, my silly implementation got some reasonable solutions by using shooting, so I was wondering what you think? So basically my decision variables are the thrust of the engine (delta v x, delta v y, delta v z) each month for 360 days for a trip to Mars. Currently, what PyGMO is doing is first generating lots of random solutions according to the bounds (maybe some are feasible) and then it refines the solution using a local optimiser. I am using Monotonic Basin Hopping as the global optimisation.

    • @MatthewKelly2
      @MatthewKelly2  8 років тому

      First, a word of caution: my expertise is mostly in manipulator planning: robotic arms and legs. That being said, I have a few ideas that might be of interest.
      --> [1] I would suggest that you read the 1998 paper by John T. Betts: "A survey of Numerical Methods for Trajectory Optimization", as it is largely focused on aerospace planning. Many of the references for that paper are related to aerospace planning, which might be of use.
      --> [2] It seems to me that there are many local optima to this sort of problem. Typically getting a good initialization is domain-specific. It seems to me that this is what you would be using the monotonic basin hopping for.
      --> [3] The "correct" way to pose such a problem, assuming that you have a reasonable guess, would be to use a multiple shooting approach. I would start with a direct method, and then move to an indirect if you need better accuracy estimates. Since you can decouple the motion of the planets from the spacecraft, you can (should) solve the motion of the planets ahead of time and then treat the problem as time-varying, with a fixed duration. Then, for your multiple shooting approach, use the same integration / physics scheme for the spacecraft motion as you used for the simulation of the planets. Both should be fixed-step schemes, or at least use the same time-grid every time. The decision variables would just be the rocket state and control at the start of each month. I suggest multiple shooting because you have a relatively simple control (zero-order-hold) over a large time span, where the system dynamics change much faster than the control.
      --> [4] There are many other ways to pose this problem that will give reasonable results. Typically these methods will either be less accurate or take much longer to run, but they might be easier to program. Simple methods are more likely to work for this problem because the control is relatively low-dimensional. Also, all of the suggestions both in (3) above and in my talk are based on the idea of using a smooth non-linear programming solver, such as SNOPT, IPOPT, or FMINCON, all of which have numerical problems if the problem is not perfectly smooth. Other optimization methods such as CMAES, PSO, Simulated Annealing, Genetic Algorithms work just fine on non-smooth problems, although they are much worse at handling constraints. For example, I've often used CMAES for finding low-dimensional robust controllers, by optimizing over a monte-carlo simulation of the robot.
      --> [5] At the end of the day, if you get a solution that is feasible and sufficiently optimal, then you can call it a success! Using the more formal methods should get a more accurate answer in less CPU time, but it might not be worth it if you have another method that already works well enough. Another thing to consider is that a big part of this problem is getting a reasonable initial guess, and that with a good enough initial guess you might just be able to use a simple single shooting method in any non-linear optimization. Let me know if this all makes sense. Good luck!
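      A rough sketch of the multiple-shooting idea from point [3], using a toy double integrator, fixed-step Euler integration, and zero-order-hold control (all names, dynamics, and numbers are illustrative, not the interplanetary problem):

```python
# Toy multiple-shooting setup: each segment starts from its own
# decision-variable state, a zero-order-hold control is applied over the
# segment, and the defect is the mismatch between the simulated segment
# end and the next segment's start state.

def simulate_segment(x0, u, dynamics, dt, n_steps):
    """Fixed-step Euler rollout over one shooting segment."""
    x = list(x0)
    for _ in range(n_steps):
        dx = dynamics(x, u)
        x = [xi + dt * dxi for xi, dxi in zip(x, dx)]
    return x

def shooting_defects(segment_starts, controls, dynamics, dt, n_steps):
    defects = []
    for k in range(len(segment_starts) - 1):
        x_end = simulate_segment(segment_starts[k], controls[k],
                                 dynamics, dt, n_steps)
        defects.extend(xe - xs for xe, xs in zip(x_end, segment_starts[k + 1]))
    return defects

dyn = lambda x, u: [x[1], u]        # double integrator: pos' = vel, vel' = u
starts = [[0.0, 0.0], [0.45, 1.0]]  # consistent with u = 1 over one segment
assert all(abs(d) < 1e-9 for d in shooting_defects(starts, [1.0], dyn, 0.1, 10))
```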

  • @shihaowang152
    @shihaowang152 6 років тому

    Cannot love this video more

  • @nicosharalabidis5286
    @nicosharalabidis5286 6 років тому

    Good afternoon Matthew,
    This is an outstanding presentation for the topic area, especially for somebody new to the topic! I have also found your dedicated web page material to be invaluable as I have begun my quest to learn about direct collocation.
    In the Matlab toolbox examples you have provided, you have managed to derive the equations of motion analytically and expressed them within Matlab symbolically.
    I am currently in a situation whereby I am using another piece of open source software called OpenSim (opensim.stanford.edu/), which is a musculoskeletal modeling piece of software capable of performing biomechanical simulations (e.g. walking gait). This software is unable to provide me the equations of motion in state-space form; instead, I am capable of determining x_dot by calling a built-in function from known states and controls.
    I was therefore wondering whether you had any experience of 'setting up the problem' in such a manner?
    If my question was too ambiguous I apologize - I can try to explain it in a more proficient manner.
    Kind regards,
    Nicos Haralabidis
    PhD Student
    University of Bath

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      Hi Nicos, This is a great question. I'm familiar with OpenSim, although I've never used it. There are two potential issues that you might run into using a physics engine inside of a trajectory optimization problem:
      1: The key reason to use physics simulators is to handle complicated dynamics equations, especially those with contact (eg. foot hitting the ground, hand grasping an object...). Handling contacts inside of the dynamics like this will typically break the optimization, because it changes the sparsity pattern in the jacobian of the constraint vector between iterations in the NLP solver (eg. FMINCON). There is a way to handle this "correctly", but it basically works by using the optimization to compute the contact sequence automatically (look up "through-contact trajectory optimization", by Michael Posa and Russ Tedrake). This will likely not be possible in your case. You might try it anyway: there is a chance that it will just work, or converge slowly but give a reasonable answer.
      2: The bigger problem here is likely to be the format of the dynamics that the simulator produces. If the simulator gives you the full state of the system at the next time step, then it will be difficult to put into a standard trajectory optimization solver. This is because these solvers expect a "continuous" form of the dynamics, where the dynamics function returns the derivative of the state, rather than the next state. There is a solution here, but it requires that you write your own transcription method. Rather than solving the dynamics yourself (eg. numerical integration), you call the physics engine. Think of this like using the trapezoid collocation equations, but replacing the collocation equation with: "state at the next collocation point" == "simulator(state at the previous collocation point)". I've never tried this, but it should work if the simulator is smooth.
      Good luck - I hope that this helps!
      Matt
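      A sketch of the second idea: the collocation equation is replaced by "state at the next point equals the simulator applied to the previous point". Here simulator_step is a toy stand-in for a call into a physics engine such as OpenSim (everything in this snippet is hypothetical):

```python
# Transcription defect built directly from a black-box simulator step,
# i.e. require x[k+1] == simulator_step(x[k], u[k]) at every grid point.

def simulator_step(x, u, dt=0.01):
    # Toy stand-in for the physics engine: a damped point mass driven by u.
    pos, vel = x
    return (pos + dt * vel, vel + dt * (u - 0.5 * vel))

def transcription_defects(states, controls):
    """Equality constraints that the NLP solver must drive to zero."""
    defects = []
    for k in range(len(states) - 1):
        predicted = simulator_step(states[k], controls[k])
        defects.append(states[k + 1][0] - predicted[0])
        defects.append(states[k + 1][1] - predicted[1])
    return defects

x0 = (0.0, 1.0)
x1 = simulator_step(x0, 2.0)  # a trajectory consistent with the simulator
assert transcription_defects([x0, x1], [2.0]) == [0.0, 0.0]
```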

    • @nicosharalabidis5286
      @nicosharalabidis5286 6 років тому

      Hello again Matthew,
      Thank you very much for your detailed reply. It has taken me a while to reply as I wanted to fully comprehend your reply.
      1. "Handling contacts inside of the dynamics like this will typically break the optimization, because it changes the sparsity pattern in the jacobian of the constraint vector between iterations in the NLP solver (eg. FMINCON)". Please can you explain why it will change the sparsity pattern of the Jacobian between iterations? By the constraints are you referring to the system dynamics constraints? Or all of the possible constraints? (e.g. the system dynamics, equality and inequality constraints?).
      Thanks for the reference to the reading material - I will make sure to read it thoroughly!
      2. I think I am in a fortunate position whereby I am able to obtain the derivative of the states at any of the time steps (current or next). I am then in a position to evaluate the system dynamics constraints by using an Euler or Trapezoidal discretization scheme.
      I also have another question. In the case of walking gait, for example, I want to find the muscle controls (and thus the joint trajectories) responsible for a human walking from 0 m to 10 m with a free final time (this is a hypothetical example). In reality, humans would make several ground contacts to produce ground reaction forces responsible for accelerating their COM towards the 10 m mark. How would a problem of this type be solved using direct collocation? How would the simulation know to make contact with the ground? Would it be related to the way in which the constraints (equality or inequality) are set? I expect that this sort of problem is typical within the robotics literature; do you have any further reading suggestions?
      Sorry again for the lengthy message.
      I look forward to your reply.
      Kind regards,
      Nicos Haralabidis

    • @MatthewKelly2
      @MatthewKelly2  6 років тому

      Hi Nicos, here are some quick answers, roughly by number:
      1) Why does change in contact cause a change in the sparsity pattern? Imagine that you have a one-dimensional hopping robot. The robot is in contact with the ground at time-step k. The gradients show that the robot can apply a force to push off from the ground, causing an acceleration. On the next iteration, the solver determines that the robot is now in the air. The gradients now show that there is no force that can cause an acceleration. The gradient between the control and the vertical acceleration went from being non-zero to zero in a discontinuous way between two iterations. This will break typical NLP solvers.
      2) Good! Sounds like you're off to the races with through-contact trajectory optimization.
      --> You will find that computing 10 meters of walking for a full bipedal model will make for a challenging trajectory optimization problem. Typically people compute one or two steps with the full model. Longer motions are typically done on a reduced model. That being said, there is no hard reason that it cannot be done. Use through-contact optimization to handle the uncertain contacts. You can let the final time in the optimization be a decision variable. Both of these things are possible using OptimTraj, although you would need to be clever about implementing the through-contact optimization I think. The two papers on through-contact optimization by Michael Posa and Russ Tedrake are likely to be your best bet, although I suspect that a few people from Stanford have done optimization with OpenSim; I just don't know them off the top of my head.
      Good luck!
      Matt
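      A tiny numerical illustration of the hopper example from point (1): the gradient of vertical acceleration with respect to thrust jumps discontinuously when an iterate crosses the contact boundary (toy model, made-up numbers):

```python
# Toy 1D hopper: thrust u only produces acceleration while the foot is on
# the ground, so the gradient of acceleration with respect to u jumps
# from 1/m to 0 the moment the solver's iterate leaves contact.

def vertical_accel(height, u, m=1.0, g=9.8):
    if height <= 0.0:
        return u / m - g   # in contact: thrust acts on the body
    return -g              # in flight: thrust does nothing

def d_accel_du(height, u, eps=1e-6):
    """Finite-difference gradient of acceleration with respect to thrust."""
    return (vertical_accel(height, u + eps) - vertical_accel(height, u)) / eps

assert abs(d_accel_du(0.0, 5.0) - 1.0) < 1e-4  # on the ground: ~1/m
assert d_accel_du(0.1, 5.0) == 0.0             # in the air: exactly zero
```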

  • @yosinhu5952
    @yosinhu5952 2 роки тому

    Are there any tutorials on pseudospectral methods available in China? I want to implement one in cpp. Fellow Chinese speakers, please contact me; I can pay for the help.