PySINDy: A Python Library for Model Discovery

  • Published Feb 5, 2025
  • github.com/dyn...
    PySINDy: A Python package for the sparse identification of nonlinear dynamical systems from data
    by Brian de Silva, Kathleen Champion, Markus Quade, Jean-Christophe Loiseau, J. Nathan Kutz, Steven L. Brunton
    Journal of Open Source Software, 5(49), 2104, 2020
    doi.org/10.211...
    This video was produced at the University of Washington

COMMENTS • 86

  • @pipertripp
    @pipertripp 4 years ago +50

    The more I watch these really cool model videos the more obvious it becomes that I need to study linear algebra.

    • @tharsisharmonia9316
      @tharsisharmonia9316 4 years ago +7

      August has been marked down as Gilbert Strang month.

    • @juanferove
      @juanferove 4 years ago +7

      Try this series by 3blue1brown, it's quite good for starting:
      ua-cam.com/video/fNk_zzaMoSs/v-deo.html

    • @superuser8636
      @superuser8636 4 years ago

      This is a fire sub comment section.

    • @tharsisharmonia9316
      @tharsisharmonia9316 4 years ago

      I think the tricky thing is trying to work these didactic journeys into fresh territory within a 'practical' project. It's the old story of problem based learning.

    • @pipertripp
      @pipertripp 4 years ago +1

      @@juanferove yep. I'm aware of that series and I'll defo check it out when the time comes.

  • @anus4618
    @anus4618 7 months ago +4

    Hi. This is Anu. Thanks for sharing these concepts with the world. Since I started watching your videos, I've become more interested in learning and implementing the SINDy algorithm. As a beginner, I was trying to model simple ODEs with a control input, but I'm struggling a lot. Instead of showing code that was prepared for a presentation, could you please make videos on how to implement SINDy from the very basic steps of coding? It would be very useful for students like us. In Alan's videos, he imported a dataset class and used functions that had been created beforehand, which made it hard for beginner students like us to grasp the concepts.

    • @chidiokoene4591
      @chidiokoene4591 3 months ago

      This is exactly what I wanted to comment.

  • @noahbarrow7979
    @noahbarrow7979 3 years ago +1

    How am I only the 713th person to like this??? Entire civilizations need to hit that "like" button. You guys are doing amazing work. Looking forward to all of your new research!

  • @sammflynn6751
    @sammflynn6751 4 years ago +2

    WOW, this can be game changing for control theory and controller design

  • @marcocarnaghi294
    @marcocarnaghi294 4 years ago +1

    I'm working on nonlinear models for interleaved converters and, in the short term, I was expecting to use SINDYc to obtain a model, so this seems like an excellent guideline for coding the method in MATLAB!! Thanks!!!

  • @sergioa.dorado-rojas7624
    @sergioa.dorado-rojas7624 4 years ago

    Dr. Brunton... You always manage to be ten steps ahead of what I'd like to do in my dissertation XD jk... Thanks for the inspiration.
    Thumbs up for the great work!

  • @anthonyrepetto3474
    @anthonyrepetto3474 4 years ago

    wow, you are the future of science! thank you :)

  • @angelalopez1461
    @angelalopez1461 4 years ago +1

    Good job!!
    We’re gonna try this package, I was waiting/looking for something like this, thanks!!

  • @samirelzein1095
    @samirelzein1095 3 years ago +1

    we need tens more of such tutorials.

  • @MrProzaki
    @MrProzaki 4 years ago +1

    Exactly what I was looking for, omg!!! Much appreciated, thank you!

  • @wenhanzhou5826
    @wenhanzhou5826 2 years ago

    This is very cool, good job and thanks for sharing!

  • @Dynamics556
    @Dynamics556 4 years ago +1

    Great video, love your style and seriousness!! Like it

  • @gregorymacchio4077
    @gregorymacchio4077 4 years ago +1

    So excited to use this!

  • @craigcollings5568
    @craigcollings5568 2 years ago

    Thanks Brian.

  • @prateekgundannavar3207
    @prateekgundannavar3207 4 years ago +2

    Many thanks for sharing this work. Quick question: what happens when there are discontinuities in the measurements, i.e., the differential equations are not necessarily smooth? Both Fourier and polynomial libraries represent continuous functions, however a linear combination of them can in some sense represent a discontinuity. However, it is not clear if pySINDy can capture these discontinuities.

  • @l.mansouri2902
    @l.mansouri2902 4 years ago

    Excellent presentation

  • @jiabeiba
    @jiabeiba 4 years ago +1

    Great video....

  • @rachidsaadane8225
    @rachidsaadane8225 4 years ago +1

    Great job!

  • @JosephRivera517
    @JosephRivera517 4 years ago +1

    This is so amazing.

  • @wy2528
    @wy2528 3 years ago

    That is really interesting !

  • @chidiokoene4591
    @chidiokoene4591 3 months ago

    I recently learned of SINDy, as I'm currently exploring practical equation-discovery workflows for ML projects at my internship host firm.
    I have been trying to use the library but have not been successful due to a series of errors, e.g. with np.math.factorial or the normalize argument. Is it that I am not using the correct version of PySINDy?
    I would very much appreciate a tutorial on how to use the package.

  • @chidiokoene4591
    @chidiokoene4591 3 months ago

    Please, could there be a series of PySINDy tutorials starting from the basics?

  • @theidealisticman
    @theidealisticman 3 years ago +2

    Hi Steve, I just watched your lecture on linear regression, where the idea is to solve for a vector x of coefficients (where Ax=b) that models a linear system, given input and output data from the system. I have tried applying this idea to a system I have, but unfortunately the performance was poor. I suspect my system is nonlinear. My question now is whether it is possible to pose my input-output data in a form that would be solvable with SINDy. I see SINDy takes in data of the states of the system. But I do not have states, I have input-output data.

    • @Eigensteve
      @Eigensteve 3 years ago

      Great question. For input-output data, we usually try something like time-delay embedding to enrich the state. There are a few videos on this method (called HAVOK) and also some recent SINDy videos giving more details about how to use in the real world.

    • @theidealisticman
      @theidealisticman 3 years ago +1

      @@Eigensteve Hi Steve, thank you for your response. But the time-delay embedding is only applicable if I have access to the states of the system, no? I only have input-output data. What is the variable x that we measure in HAVOK? Is it a state of the system?

    • @Eigensteve
      @Eigensteve 3 years ago

      @@theidealisticman Good question. So your output is generally some function of your state, y=g(x). So this does have some information about x, although it is incomplete information. The time delay embedding can help get more information about "x" from delayed measurements of "y". The delay vector won't be exactly the same as x, but under some conditions it can recover enough information to be useful for modeling.

    • @theidealisticman
      @theidealisticman 3 years ago +1

      @@Eigensteve Hi again Steve, I apologise for the back and forth, I hope you have time for it.
      I have just used the HAVOK algorithm to model my system. I used output data, y, from my system (corresponding to some input data u) as "x", as you suggest above. The problem I am now encountering is that the model does not seem to have predictive power. The model can reproduce the output data I used to model the system almost perfectly, but this relies on that "forcing term", vr. And this vr relies on my already knowing the output of the system. How can this model predict the output, y, corresponding to some other input data?

    • @Eigensteve
      @Eigensteve 3 years ago +1

      @@theidealisticman This is a good point. For chaotic systems, the forcing vr doesn't really help with precise predictions, but it can help give an indication of a big change that is upcoming, like switching lobes in the Lorenz system. But for non-chaotic systems, you don't need the forcing term: it is possible to get a closed model vdot = A v without adding + B*vr. So for these systems, it is more likely to get a good prediction. Quasi-periodic systems should work
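
The time-delay embedding discussed in this thread can be sketched in a few lines of NumPy. This is only a minimal illustration, not the HAVOK implementation: the helper name `delay_embed`, the number of delays, and the sinusoidal test signal are all assumptions made for the example. It stacks delayed copies of a single measured output y into a Hankel matrix and takes its SVD to obtain delay coordinates.

```python
import numpy as np


def delay_embed(y, n_delays):
    """Stack delayed copies of a scalar series y into a Hankel matrix.

    Row i holds y[i], y[i+1], ..., so each column is one delay vector.
    (Hypothetical helper, for illustration only.)
    """
    y = np.asarray(y, dtype=float)
    n_cols = y.size - n_delays + 1
    if n_cols < 1:
        raise ValueError("n_delays must not exceed len(y)")
    return np.stack([y[i:i + n_cols] for i in range(n_delays)])


# Assumed example: one measured output y = g(x) of an oscillator.
t = np.linspace(0.0, 10.0, 1001)
y = np.sin(t)

# Hankel matrix of delayed measurements: 5 delays, one column per time window.
H = delay_embed(y, n_delays=5)

# The SVD gives delay coordinates: the rows of Vh are time series that can
# stand in for the unmeasured state when fitting a model.
U, s, Vh = np.linalg.svd(H, full_matrices=False)
```

For a pure sinusoid, only two singular values are significant, which reflects the two-dimensional state of the underlying oscillator.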

  • @sgrouge
    @sgrouge 4 years ago +1

    Can it be applied to analyse financial timeseries???

  • @cy-ti8ln
    @cy-ti8ln 3 years ago +1

    Can we use PySINDy in car design problems, for example in crash simulations, etc.?

    • @Eigensteve
      @Eigensteve 3 years ago +2

      In general, you can use this on any system where you have good data and would like to build a fast, flexible model that is minimally parameterized. So lots of fields could be open for this.

    • @cy-ti8ln
      @cy-ti8ln 3 years ago +2

      @@Eigensteve Thanks for the rapid response. I watched the SINDy algorithm videos today, from morning to night. It is actually brilliant for extracting models. I finally understand that autoencoders will automatically extract the useful coordinates with certain weights, and the combined loss function selects the best coordinates and corresponding weight pairs. But I didn't understand what LASSO actually does. Could you explain it briefly in 2-3 sentences? Thank you.

    • @Eigensteve
      @Eigensteve 3 years ago +1

      @@cy-ti8ln We don't really use LASSO for SINDy, although it is a good analogy, since LASSO is widely used in statistics. LASSO is essentially a least-squares regression with an L1-norm regularizer on the unknown vector being solved for. Here is a video I did on LASSO a while back: ua-cam.com/video/GaXfqoLR_yI/v-deo.html

    • @cy-ti8ln
      @cy-ti8ln 3 years ago +1

      @@Eigensteve It is clearer to me now. You actually modify the loss function of the autoencoder along the lines of the LASSO loss function.
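
For reference, the LASSO objective described in this thread (least squares plus an L1-norm penalty on the unknown vector) can be minimized with a short proximal-gradient loop, often called ISTA. The sketch below is an illustration under stated assumptions, not anything taken from PySINDy: the function names, the penalty weight `lam`, and the synthetic data are all hypothetical, and as the reply above notes, SINDy does not really use LASSO.

```python
import numpy as np


def soft_threshold(z, tau):
    """Shrink each entry of z toward zero by tau (proximal map of the L1 norm)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)


def lasso_ista(A, b, lam, n_iter=500):
    """Approximately minimize 0.5*||A x - b||^2 + lam*||x||_1 with ISTA.

    (Hypothetical helper, for illustration only.)
    """
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - b)            # gradient of the least-squares term
        x = soft_threshold(x - step * grad, step * lam)
    return x


# Assumed synthetic problem: only two of ten coefficients are nonzero.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))
x_true = np.zeros(10)
x_true[[1, 6]] = [2.0, -3.0]
b = A @ x_true

x_hat = lasso_ista(A, b, lam=1.0)
```

The soft-thresholding step is what drives small coefficients to zero; the L1 penalty also shrinks the surviving coefficients slightly toward zero.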

  • @Dynamics556
    @Dynamics556 4 years ago +2

    U should have your own channel

  • @jeffdaniels6327
    @jeffdaniels6327 4 years ago +1

    Very nice work! I will send a pull request if I find something to contribute.

  • @malinivyakaranam1036
    @malinivyakaranam1036 4 years ago +2

    Thank you Steve for PySindy, it’s really an amazing library, have a quick question for you- what is the minimum number of measurements that you would need to successfully model?

    • @Eigensteve
      @Eigensteve 4 years ago +4

      Generally having a lot of resolution in time and clean data is important. Co-author Kathleen Champion showed that it is possible to identify the Lorenz system with less than one full oscillation (arxiv.org/abs/1805.07411).

  • @Daniel88santos
    @Daniel88santos 4 years ago +1

    Really interesting material! But I have a question: should I use time embedding + DMD or this SINDy algorithm? In which situations should I use one or the other? Thank you.

    • @loiseaujc
      @loiseaujc 4 years ago +2

      Hi Daniel,
      This is an excellent question! DMD is a method that lets you jointly obtain a low-dimensional representation of the state and a linear model of its dynamics. Nothing prevents you, however, from simply using DMD as a pure dimensionality reduction technique and then using the time series of the DMD modes' amplitudes as the input time series to identify a nonlinear model with SINDy. This is actually my rather typical workflow, and I have recently illustrated it here (arxiv.org/abs/1911.07920). From a practical point of view, I tend to prefer DMD over POD/PCA for low-dimensional embedding of dynamical systems, as it incorporates information about the causal link between x[k] and x[k+1], while POD/PCA essentially just aims at obtaining a low-rank approximation of the second-order statistics of x.

  • @ThomasHaberkorn
    @ThomasHaberkorn 3 years ago

    Does it calculate the transfer function?

  • @alkiriiiic
    @alkiriiiic 4 years ago +3

    Hi, I am a computer engineering student from Chile.
    Can I use some references and clips from your video for a lecture? Obviously with references.

  • @Quanja77
    @Quanja77 4 years ago +1

    This is a gold mine lead. Thanks a lot! Question: How about Laplace/freq domain representation?

    • @Eigensteve
      @Eigensteve 4 years ago +1

      I have a series called "Control Bootcamp", and this has a video on Laplace domain. I have a few more videos on this coming soon.

    • @Quanja77
      @Quanja77 4 years ago

      @@Eigensteve nice! Data to Laplace sysId is always fresh.

  • @hfkssadfrew
    @hfkssadfrew 4 years ago +1

    Hope ISINDY is also added in the package

    • @Eigensteve
      @Eigensteve 4 years ago

      We are working on it, but we hope the community adds more functionality as needed.

  • @danielhoven570
    @danielhoven570 4 years ago +1

    Great work guys! Do you plan on any releases for the C environment?

    • @Eigensteve
      @Eigensteve 4 years ago +1

      No immediate plans, but this could be interesting. I know there are efforts to scale these algorithms up for some of the DOE machines.

    • @danielhoven570
      @danielhoven570 4 years ago

      Steve Brunton That’s pretty cool! This release is just in time to integrate with my Nvidia Jetson nano. I’m trying to build a real time system ID and MPC for an autonomous racing drone. I was going to use a “manual SINDy” where I back out a guess of a sparse nonlinear system, but this is better.

  • @MantoshKumar-vc2qy
    @MantoshKumar-vc2qy 3 years ago

    ImportError: cannot import name 'trapezoid' from 'scipy.integrate' (C:\Users\mann\anaconda3\lib\site-packages\scipy\integrate\__init__.py)
    I am getting this error when i run this line
    import pysindy as ps
    please resolve this asap

  • @drumbum7999
    @drumbum7999 4 years ago +1

    if you're calculating the Xdot matrix from the position data you're going to have n-1 columns

    • @Eigensteve
      @Eigensteve 4 years ago +1

      Depends on how you compute the derivative. If you use forward difference for the first point, central difference for the middle points, and backwards difference for the end point, you should be fine. There are lots of other ways to compute these too.

    • @drumbum7999
      @drumbum7999 4 years ago

      @@Eigensteve That's a great point. I'm almost embarrassed that I didn't consider any other schemes for computing Xdot, but I appreciate you refreshing my memory.

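The differencing scheme described in this thread (forward difference at the first point, central differences in the middle, backward difference at the last point) is what `numpy.gradient` computes with its default `edge_order=1`, so the derivative array has the same length as the data. The sketch below assumes, purely for illustration, uniformly sampled data stored with time along the rows and a made-up two-state signal.

```python
import numpy as np

# Assumed example data: rows are time samples, columns are state variables.
t = np.linspace(0.0, 2.0, 201)
dt = t[1] - t[0]
X = np.column_stack([np.sin(t), np.cos(t)])

# One-sided differences at the two ends, central differences in between.
X_dot = np.gradient(X, dt, axis=0)

# The same scheme written out by hand, for comparison.
X_dot_manual = np.empty_like(X)
X_dot_manual[0] = (X[1] - X[0]) / dt                # forward difference
X_dot_manual[1:-1] = (X[2:] - X[:-2]) / (2.0 * dt)  # central differences
X_dot_manual[-1] = (X[-1] - X[-2]) / dt             # backward difference
```

If the data are stored with time along the columns instead, as in the comment above, the same call works with `axis=1`.
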
  • @cnbrksnr
    @cnbrksnr 4 years ago +1

    how can one ensure that the algorithm is not "over-fitting" ?

    • @Eigensteve
      @Eigensteve 4 years ago +3

      The sparsity promoting optimization in PySINDy helps to discover models that have the fewest terms required to explain the data. This generally helps to prevent over fitting.

    • @juanferove
      @juanferove 4 years ago

      Besides avoiding over-fitting by design, the package also lets you add regularization to the loss function. I haven't tried it myself yet, though. (See pysindy.readthedocs.io/en/latest/api/pysindy.optimizers.html?highlight=regularization for the API and pysindy.readthedocs.io/en/latest/tips.html#regularization for a discussion of it.)
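
One common sparsity-promoting scheme in the SINDy literature is sequentially thresholded least squares: fit by least squares, zero out small coefficients, refit on the surviving terms, and repeat. The NumPy sketch below is an assumption-laden illustration rather than PySINDy's own code; the function name `stlsq`, the threshold value, the candidate library, and the toy system x' = -2x are all chosen for the example.

```python
import numpy as np


def stlsq(Theta, X_dot, threshold, n_iter=10):
    """Sequentially thresholded least squares (illustrative sketch).

    Theta: (n_samples, n_features) library of candidate terms.
    X_dot: (n_samples, n_states) derivative data.
    """
    Xi = np.linalg.lstsq(Theta, X_dot, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0                      # drop terms with small coefficients
        for k in range(X_dot.shape[1]):
            big = ~small[:, k]
            if big.any():                    # refit using only the surviving terms
                Xi[big, k] = np.linalg.lstsq(
                    Theta[:, big], X_dot[:, k], rcond=None
                )[0]
    return Xi


# Assumed toy system: x' = -2 x, with slightly noisy derivative data.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 2.0, 200)
x = 3.0 * np.exp(-2.0 * t)
x_dot = (-2.0 * x + 1e-3 * rng.standard_normal(t.size)).reshape(-1, 1)

# Candidate library: [1, x, x^2]
Theta = np.column_stack([np.ones_like(x), x, x**2])

Xi = stlsq(Theta, x_dot, threshold=0.1)
```

Only the coefficient on the x term should survive the thresholding here, which is the sense in which sparsity guards against over-fitting: terms the data do not need are removed rather than given small nonzero weights.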

  • @dr.gordontaub1702
    @dr.gordontaub1702 4 years ago

    I couldn't quite make out the URL for the original SINDy paper in the video. Can you post the URL for the original paper in the comments?

  • @Dynamics556
    @Dynamics556 4 years ago +1

    The question is why take an example of a particle? Why not use it to solve real problems?

    • @Eigensteve
      @Eigensteve 4 years ago +2

      We always like to start with simple illustrative examples, because then we actually know what is going on. But several researchers have applied these methods to identify the dynamics of real systems. Here are a couple links for fluids by Jean-Christophe Loiseau: (arxiv.org/abs/1611.03271; arxiv.org/abs/1706.03531; arxiv.org/abs/1911.07920)

  • @FBCDC
    @FBCDC 4 years ago +1

    Great video! Thanks!
    Question: How well does PySINDy perform with limited time series data?

    • @Eigensteve
      @Eigensteve 4 years ago +1

      Generally having a lot of resolution in time and clean data is important; the absolute duration of the time-series data might be less important. Co-author Kathleen Champion showed that it is possible to identify the Lorenz system with less than one full oscillation (arxiv.org/abs/1805.07411).

  • @liamwhite7710
    @liamwhite7710 4 years ago

    Is there any way of obtaining this specific example shown in the video?

    • @brian9806
      @brian9806 4 years ago +2

      Sorry for the delay, but I've added the materials to the PySINDy GitHub repository. You'll find them under docs/UA-cam.

  • @ahmedcelik5448
    @ahmedcelik5448 2 years ago

    Is there any pyDMD?

  • @tonyudasher954
    @tonyudasher954 4 years ago

    I would like to kindly point out some issues with this video. The speaker did not say that the video is not a general explanation or tutorial about methods for finding nonlinear equations. It seems to me that the intended purpose is to present a package developed by the speaker, but even so, the video does not describe the package in sufficient detail. It would be really nice if the speaker said at the beginning that this video is only an announcement of the package. This is necessary because there are many very good tutorial videos about finding nonlinear equations from a set of data (points), and a person who does not know the subject well will be confused by this video if they interpret it as a tutorial. I really hope that the authors will add a message that declares the intended purpose and audience of this video. Thank you in advance for your kind attention.

  • @saitaro
    @saitaro 4 years ago +3

    Where is Steve, is he safe?

  • @ThePflasterle
    @ThePflasterle 4 years ago +1

    Thank you, this looks great! Two questions:
    - Is there a plan to extend this to your new SINDy-PI algorithm, which seems way more robust? github.com/dynamicslab/SINDy-PI
    - Could you comment a bit more on using multiple initial conditions? I haven't seen much discussion of this in the SINDy papers, though maybe I missed it. In my application (neuroscience) I have many 'trials' (measurement runs) with different (unknown) initial conditions. So rather than having one (or a few, as discussed here) initial conditions with many variables, I have a few (noisy) variables but many measurement runs with varying initial conditions. Is there anything to say about a potential trade-off when fitting a model using many different initial conditions, which I imagine could introduce its own set of issues (having to infer initial conditions)?

  • @sanjayrathva9871
    @sanjayrathva9871 4 years ago

    @atif khan
