The Beauty of Linear Regression (How to Fit a Line to your Data)

  • Published 25 Nov 2022
  • In this video, we'll explore the concepts surrounding linear regression. Linear regression is very useful in math, science, and engineering, and is a gateway to other kinds of regression and to optimization problems in general.
    Download the Linear Regression Example Code here: pastebin.com/7cgh951s
    Thanks to fesliyanstudios.com for the background music! :)

COMMENTS • 256

  • @RichBehiel
    @RichBehiel  1 year ago +42

    Hi everyone, this video has been getting a lot of views lately so I just wanted to say thank you, and I really appreciate all the positive feedback. It’s great to see such a positive response, and I’m glad that so many people are enjoying linear regression! :)
    I also appreciate the constructive criticism! A few of you have pointed out that the music is distracting, the motion is too repetitive, and the pace is a bit slow. I didn’t see that when posting the video, but I can totally see where you’re coming from, so I’ll definitely take that into account when making future videos. This was one of my earlier videos and I was still figuring things out. So I really appreciate your feedback, and I hope these videos will get better over time.

    • @myetis1990
      @myetis1990 1 year ago +2

      You are not only teaching math stuff but also teaching how to think, thank you very much for the great video.
      Really inspiring, glad I discovered this channel. Waiting for the videos about the Jacobian, translation, rotation, quaternions

    • @ehfik
      @ehfik 8 months ago

      The constant animation loop gets a bit annoying. Reversing, stopping, and changing the animation from time to time would be a solution (and your newer videos are even better anyway!)

    • @RichBehiel
      @RichBehiel  8 months ago +1

      I agree. Honestly I look back on this video and cringe at a few of the details, like how the animation loop goes on and on and is a bit nauseating, and the music is too loud. But you live and learn! 😅 When I first started making these videos I really had no idea what I was doing.

    • @phenixorbitall3917
      @phenixorbitall3917 6 months ago

      @RichBehiel 18:19 on the left-hand side you used the Laplace symbol (Δ) instead of the nabla symbol (∇).
      But except for that => great video! 👍

    • @atticmuse3749
      @atticmuse3749 2 months ago +1

      With regards to pacing, I want to say that I really enjoy your general presentation style. You're not simply reading a script and getting the perfect take, you're actually doing a "live" presentation and I really appreciate the way you ad lib or go off on little tangents. I burst out laughing in your buoyancy video when you read the integral "zndS" phonetically.

  • @patricktanoeyjaya4430
    @patricktanoeyjaya4430 1 year ago +69

    I really love how calmly you speak and how the lines you say feel unscripted. Makes it feel very personal.
    You also speak so clearly and concisely. I was able to get the gist of this with only high school calculus!
    This is making me like math again.

    • @RichBehiel
      @RichBehiel  1 year ago +2

      I’m very glad to hear that! :)

  • @TheRiverNyle
    @TheRiverNyle 1 year ago +70

    As an Applied Math (Stats/Probability Theory focused) major, this really got me excited!

  • @user-pw5do6tu7i
    @user-pw5do6tu7i 1 year ago +20

    Unbelievably crisp explanation of gradient descent. It is remarkable to see it play out in those dimensions. Thank you

    • @whannabi
      @whannabi 1 year ago +2

      And he repeats the animation so we can assimilate what's going on instead of quickly switching to the next thing. Very relaxed explanation which is nice.

  • @matteokimura1449
    @matteokimura1449 1 year ago +28

    Another beautiful way to get a linear regression formula is to take the vector space of all real-valued functions that are defined for the x values, choose the hypothetical ideal function that maps all of the x's to their y's, and orthogonally project that hypothetical function onto the subspace of linear functions. By defining the inner product as the Cartesian dot product between the outputs of the functions at the x values, you'll see that the distance the projection minimizes is the error between the linear function and the ideal function.
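    A compact sketch of that construction, in my own notation (not the commenter's): write f* for the hypothetical ideal function with f*(xᵢ) = yᵢ, then

    ```latex
    % inner product induced by sampling at the data points
    \langle f, g \rangle = \sum_{i=1}^{N} f(x_i)\, g(x_i)

    % orthogonal projection of f^* onto the subspace of linear functions
    \hat{f} = \operatorname*{arg\,min}_{g(x) = ax + b} \lVert f^{*} - g \rVert^{2}

    % the residual is orthogonal to the basis \{x, 1\}:
    \langle f^{*} - \hat{f},\, x \rangle = 0, \qquad \langle f^{*} - \hat{f},\, 1 \rangle = 0
    ```

    Expanding the two orthogonality conditions gives Σᵢ(yᵢ − axᵢ − b)xᵢ = 0 and Σᵢ(yᵢ − axᵢ − b) = 0, which are exactly the normal equations of least squares.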

  • @andreiimbru6835
    @andreiimbru6835 1 year ago +16

    As an Econ major, you have no idea how much this helped me understand the behind-the-scenes of regression lines and everything I've done in statistics this semester. I've learned so many new techniques for equation manipulation, so thank you!

  • @zeyogoat
    @zeyogoat 1 year ago +9

    A rare video that's technically adept and, most importantly, not condescending or pedantic! Well done, from a chemist and educator =)

  • @mroygl
    @mroygl 1 month ago

    This is a piece of art, a captivating blend of deep understanding of the matter, beauty of plain graphics, voice acting, matrices, and "simple" software.

  • @MattHudsonAtx
    @MattHudsonAtx 2 months ago +1

    I saw the calculus approach coming a mile away but it's great to see the linear algebra done so clearly. I need to take that again.

  • @johnstuder847
    @johnstuder847 6 months ago +3

    Thank you! This is definitely one of YouTube's math gems! It ties so many ideas together. I would love for you to do a video on Fourier epicycles. For reference, GoldPlatedGoofs' 'Fourier for the rest of us' is a great starting point. I'm sure you could do a beautiful refined version showing how the inner product, Fourier, QM, function spaces, and art all come together in a beautiful way.
    Thank you so much for sharing your videos!

    • @RichBehiel
      @RichBehiel  6 months ago

      Thanks for the kind comment, John! :) I touch on Fourier analysis in my upcoming video on relativistic QM, the Klein-Gordon equation. Hoping to upload it within a week.

  • @tommyproductions891
    @tommyproductions891 1 year ago +12

    Great video! I love how at the start you explain the equation of a straight line, and by the end it's multivariable vector calculus

  • @Liberty5_3000
    @Liberty5_3000 1 year ago +23

    It's so beautiful! Thank you a lot! I hope your channel is gonna grow fast soon

  • @berndkopera7723
    @berndkopera7723 1 year ago +7

    Absolutely beautiful visualization! Simple, smart and intuitive.

  • @M.KRISHNAKANTACHARY
    @M.KRISHNAKANTACHARY 1 month ago +1

    Thanks a lot for clearly explaining the concept of fitting a linear regression so beautifully.

  • @simonleonard5431
    @simonleonard5431 1 year ago +4

    Thank you! I've been playing with a spherical geometry problem and there's so much I've forgotten from my school days. This video reminded me of so many things, including ways of expanding my approaches to problem solving. Brilliant 👌

  • @atticmuse3749
    @atticmuse3749 2 months ago +1

    12:16 "it should keep you up at night"
    Very apropos considering it's almost 4:30 am right now and I've been watching your videos for hours 😅

  • @jiadong2246
    @jiadong2246 1 year ago +1

    Great work! Thank you, and I'm looking forward to the linear regression and gradient descent videos you mentioned at the end of this one

  • @ivopfaffen
    @ivopfaffen 1 year ago +2

    Sooo cool! As a CS major struggling with a numerical analysis class, this helped me understand linear regression so much better.
    Thanks man!

  • @xxge
    @xxge 1 year ago +2

    Great video! Coming from a linear-algebra-heavy background, I still think taking the singular value decomposition of X, inverting it, and multiplying by y to find b is a more elegant and simple approach, especially for multiple linear regression, but I imagine if you have more experience with physics this approach would be more familiar and easier to digest. Keep these videos coming!
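    For anyone curious, a minimal numpy sketch of that SVD route (toy data of my own, not from the video):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    x = np.linspace(0, 10, 50)
    y = 2.0 * x + 1.0 + rng.normal(0, 0.5, x.size)

    X = np.column_stack([x, np.ones_like(x)])  # design matrix with rows [x_i, 1]
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    b = Vt.T @ ((U.T @ y) / s)                 # pseudoinverse of X applied to y
    print(b)                                   # ~ [2.0, 1.0] = (slope, intercept)
    ```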

  • @sujalgvs987
    @sujalgvs987 1 year ago +4

    I absolutely loved this video. Please do more videos on regression and machine learning as a whole.

  • @ehfik
    @ehfik 8 months ago +1

    this was SO satisfying! hope to see many more explanations, such a great execution!

  • @Ayesha_F
    @Ayesha_F 1 year ago +3

    Oh this was so SATISFYING! I don't think I have ever seen regression explained this way. It's like parts of how I understand it are being so wonderfully articulated by someone who obviously knows the subject matter well. I have had to teach myself mathematics and statistics, and I've always been drawn to this intuitive and philosophical way of understanding it. Thank you for this!

    • @RichBehiel
      @RichBehiel  1 year ago +2

      Thanks for the kind comment, and I’m glad you enjoyed the video! :)

  • @dadamczyk
    @dadamczyk 1 year ago +3

    Great video! With those animations, it would be wonderful to see an essay on Bayesian linear regression, since it is a quite different and powerful approach to a similar topic.

  • @anthonyrojas9989
    @anthonyrojas9989 1 year ago +2

    This was amazing! So fun to watch and appreciate this concept.

    • @RichBehiel
      @RichBehiel  1 year ago

      Thanks, glad you enjoyed the video! :)

  • @enricolucarelli816
    @enricolucarelli816 2 months ago +1

    Wow! This is perfection explaining/visualizing complexity and its beauty! ❤❤❤❤ 👏👏👏👏👏

  • @tesstera
    @tesstera 1 year ago +1

    Amazing! Thanks for showing us how to solve a maths problem in a physics way. Even though this method is already used in today's AI, it is still very interesting to see it work outside AI. The conceptual journey you've taken reminds me of my attempt at machine proving, or ATP, and does a lot to eliminate the intimidation of numerical analysis. Thanks!

  • @benwinstanleymusic
    @benwinstanleymusic 1 year ago +1

    Really enjoyed this, you're great at explaining stuff

  • @jwilliams8210
    @jwilliams8210 1 year ago +2

    Fantastic presentation!

  • @alexkushnir8073
    @alexkushnir8073 1 year ago +1

    Cool music Richard, it opens my mind and makes me understand things better! It's like combining hypnosis and a class ;-) I wish my math teacher at school had explained it to us that way 🙂

  • @TheScepticalChymist
    @TheScepticalChymist 1 year ago +1

    I cannot finish the video because your voice is SO charming and comforting and makes me feel so safe, I just cannot pay attention to the maths

  • @bernard2735
    @bernard2735 1 year ago +1

    Beautifully explained, thank you. Liked and subscribed and looking forward to more.

  • @Cristi4n_Ariel
    @Cristi4n_Ariel 1 year ago +1

    This was interesting! Thanks for sharing.

  • @davidandrewthomas
    @davidandrewthomas 1 year ago +1

    This is beautifully put together! What a great explanation!

  • @levimillerfandom
    @levimillerfandom 1 year ago +1

    I was really stuck on a practical. I had to make a graph of my readings; the book stated that I should get a straight line, but instead I got curves, which was really stressful. Thankfully I found your video.
    It really helped ❤
    Thanks again

  • @Aziqfajar
    @Aziqfajar 1 year ago +1

    This is beautifully explained and visualized! I'm glad to be on the first wagon for the ride of this video.

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Thanks, I’m glad you liked the video! It’s one of my favorite mathematical concepts, so it’s great to see others enjoying it too :)

  • @user-hl8sv1if7j
    @user-hl8sv1if7j 1 year ago +1

    Wow. So well explained. Thank you

  • @coreymonsta7505
    @coreymonsta7505 1 year ago +1

    I love code and taught calc 3 a couple of times, which is my favorite class, but never learned about this topic in school (only heard its name a lot). That was really interesting

  • @benjaminshropshire2900
    @benjaminshropshire2900 1 year ago +2

    IIRC there *is* a way to leverage that outer product observation: if D is a matrix where each column is [xᵢ 1] and Y is another matrix where each row is [yᵢ], then the entire left Σ becomes DDᵀ and the entire right Σ becomes DY.
    Also (I think) this actually generalizes to linear equations with more terms by adding the data as more rows in D. And the data can also be functions of existing simpler terms (e.g. Nth powers of x to get polynomial fits, sin(nx)/cos(nx) to get discrete Fourier transforms, etc.).
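    A quick numpy sketch of that layout (toy data mine, with D's columns being [xᵢ, 1] exactly as the comment describes):

    ```python
    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
    y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

    D = np.vstack([x, np.ones_like(x)])     # each column of D is [x_i, 1]
    a, b = np.linalg.solve(D @ D.T, D @ y)  # DD^T is the left sum, DY the right

    # extra rows generalize the fit, e.g. a quadratic via a row of x^2:
    Dq = np.vstack([x**2, x, np.ones_like(x)])
    coeffs = np.linalg.solve(Dq @ Dq.T, Dq @ y)
    ```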

  • @andytroo
    @andytroo 1 year ago +4

    Introducing the Jacobian could be a nice extension - the contours of the error landscape are ellipses, which can make converging towards the best solution hard, as many of the gradient directions in the top half of your example are not pointed towards the best solution, simply towards that valley of best fit. Reshaping the gradients to make those ellipses circles allows much quicker convergence

    • @RichBehiel
      @RichBehiel  1 year ago

      Great idea! I’d love to do a video on that someday.

  • @CarlosHlavacek
    @CarlosHlavacek 1 year ago +1

    Really beautiful class.

  • @TranquilSeaOfMath
    @TranquilSeaOfMath 1 year ago

    I really like all you put into this video. It helps connect ideas in interesting ways. Thank you for including the Python code.

  • @Lado916
    @Lado916 1 year ago +1

    Great video! I absolutely love the visual and dynamical proofs in math.
    I just wanted to add that there is a beautiful point-line duality between the two spaces:
    While a dot in parameter space corresponds to a line in real space, a line in parameter space defines a family of curves in real space that intersect at the same point.
    Moreover, if you map your datapoints to their corresponding dual lines, the center of mass of these lines will be a dual point to the best fit line of the data!
    Hope you find this as cool as I do.

    • @RichBehiel
      @RichBehiel  1 year ago +1

      That’s really cool! I’ve read about that kind of thing in an intro to differential geometry book, but hadn’t connected the dots in the context of this video. Thanks for a very interesting comment :)

  • @wishIKnewHowToLove
    @wishIKnewHowToLove 1 year ago +1

    he just dropped the most beautiful linear regression video and thought we wouldn't notice

  • @zeb4827
    @zeb4827 1 year ago +1

    very cool video, this connected some dots that I've been struggling to reconcile

  • @marktahu2932
    @marktahu2932 1 year ago +1

    Really very helpful - and I'm no professional in any of these fields, just an old technician who is being reminded of all those brain neurons that have lain dormant for decades.

  • @AfroNyokki
    @AfroNyokki 1 year ago +1

    Great explanation, loving it so far. I'm majoring in applied math with a focus in numerical analysis, so this stuff is always fascinating haha. I noticed around 18:20, you started using delta instead of del. Thought it might be a typo but just wanted to check!

    • @RichBehiel
      @RichBehiel  1 year ago

      Yeah that’s a typo, sorry! 😅 Thanks for pointing that out.

  • @kalaiselvan6907
    @kalaiselvan6907 1 year ago +1

    ❤️❤️❤️This is Gold ❤️❤️❤️ Thank you

  • @chrislau9835
    @chrislau9835 1 year ago +1

    Very good explanation 👍🏻👍🏻

  • @RocaSeba
    @RocaSeba 1 year ago +1

    This video is genius. Subscribed.

  • @8megabitz706
    @8megabitz706 1 year ago +1

    I've been waiting for this for too long 10:17

  • @IAmTheFuhrminator
    @IAmTheFuhrminator 1 year ago +1

    Such a great video! I had a lecture about this years ago in my engineering analysis class in undergrad, but I took such poor notes that I was never able to reproduce this function. Now as homework I'm going to take your process and solve for other functions like parabolas or cubics which will require me to use 3 and 4 dimensional parameter spaces. Thanks again for the great video!

    • @RichBehiel
      @RichBehiel  1 year ago +1

      That’s awesome, I love to hear that! Challenge for you: can you solve it for a general N-degree polynomial? Like with some kind of recursive algorithm. I actually don’t know if this is possible but it seems like a fun puzzle!

    • @IAmTheFuhrminator
      @IAmTheFuhrminator 1 year ago +1

      @@RichBehiel that would be a fun problem to solve! And even if it can't be solved, I'm sure proving or disproving the possibility of a solution would make a great paper!

  • @mskiptr
    @mskiptr 1 year ago +1

    The parameter space is a super powerful concept. Especially in computer vision, where you can take a bunch of pixels and quickly detect all the lines they approximately form
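    That pixels-to-lines trick is, I assume, the Hough transform; a bare-bones sketch with made-up pixel coordinates:

    ```python
    import numpy as np

    # each point votes for every (rho, theta) line passing through it;
    # peaks in the accumulator are lines supported by many points
    pts = np.array([[0, 0], [1, 1], [2, 2], [3, 3], [0, 2]])  # toy pixels
    thetas = np.deg2rad(np.arange(0, 180))
    rhos = np.arange(-5, 6)                                   # 1-pixel bins
    acc = np.zeros((len(rhos), len(thetas)), dtype=int)

    for px, py in pts:
        r = px * np.cos(thetas) + py * np.sin(thetas)         # rho at each theta
        idx = np.round(r - rhos[0]).astype(int)               # nearest rho bin
        ok = (idx >= 0) & (idx < len(rhos))
        acc[idx[ok], np.flatnonzero(ok)] += 1

    i, j = np.unravel_index(acc.argmax(), acc.shape)
    print(rhos[i], np.rad2deg(thetas[j]))                     # strongest line
    ```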

  • @maxfitzkin9422
    @maxfitzkin9422 1 year ago +1

    I really loved how you put this video together! What did you use to animate and edit everything? It was really clean!

    • @RichBehiel
      @RichBehiel  1 year ago

      Thanks! :) I used matplotlib in Python.

  • @michahejman6712
    @michahejman6712 1 year ago +1

    Great video! 30 minutes felt like 5 :) Thanks!!!

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Thanks, glad you enjoyed the video! :)

  • @ABKW119
    @ABKW119 1 year ago +4

    Why do your videos only get recommended to me at 1am, they send me straight down a rabbit hole 😂

  • @micahwithabeard
    @micahwithabeard 1 year ago +1

    I just liked, subbed, and commented :D I don't think I can be any more "violently complimentary" than that. This was excellent, thanks!

  • @pickle.taesan
    @pickle.taesan 1 year ago +1

    Great video! I never thought of parameter space as having an 'error force'.

  • @user-pn1lm3pi6p
    @user-pn1lm3pi6p 1 year ago +1

    Very good!

  • @ydl6832
    @ydl6832 1 year ago +1

    Yeah, this is a nice explanation. A neural network is just a more sophisticated version of line fitting with more parameters.

  • @GradientAscent_
    @GradientAscent_ 1 year ago +1

    Very cool animations

  • @williamfurtado1555
    @williamfurtado1555 1 year ago +4

    This video is wonderful. How did you create the interactive visualization with the "Parameter Space" and "Real Space" subplots? I'd love to be able to create one on my own.

    • @RichBehiel
      @RichBehiel  1 year ago +8

      Thanks William! :) For this video I used Python, specifically matplotlib. You can use that by downloading Anaconda, which will install Python and some scientific modules, then call “from matplotlib import pyplot as plt”. After calling that line, you can use things like plt.figure() and plt.plot() to make a figure and plot things. In this case the parameter space and real space are two subplots in a figure. They’re refreshing at 60 frames per second in a loop which sets the dot’s position in the parameter space while making the line in the real space, based on the current a and b values. To turn on the error landscape, I also added some code to evaluate the error metric (objective function) at all points in the parameter space for each a and b. Then for the error force I calculated and plotted the negative gradient of that. For the part where the dot descends down the gradient, I used F = ma - kv with mass parameter m and friction-ish parameter k to make the dot roll down the hill and then stop at the optimal point.
      I’ll be more careful in future videos to post the source code of the animations too. Well, at least for videos after the one I’m going to post this week; for that one, and the previous videos, I was very sloppy with the code and it wouldn’t be too helpful to see them. But there have been a few comments now about how these animations were made, so I figure the best answer is the code itself. In the future I’ll be better about writing cleaner animation code and sharing it.
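      As a rough illustration of that recipe, here is a minimal sketch (my own invented data and constants, not the video's actual source):

      ```python
      import numpy as np
      import matplotlib.pyplot as plt
      from matplotlib.animation import FuncAnimation

      rng = np.random.default_rng(1)
      x = np.linspace(0, 5, 30)
      y = 1.5 * x + 2.0 + rng.normal(0, 0.8, x.size)

      def neg_grad(a, b):                          # the "error force" -dV/d(a,b)
          r = a * x + b - y
          return -2 * np.array([(r * x).sum(), r.sum()])

      fig, (axP, axR) = plt.subplots(1, 2, figsize=[8, 4.5])
      A, B = np.meshgrid(np.linspace(-1, 4, 200), np.linspace(-2, 6, 200))
      V = ((A[..., None] * x + B[..., None] - y) ** 2).sum(axis=-1)
      axP.contourf(A, B, np.sqrt(V), levels=50)    # sqrt flattens the landscape
      dot, = axP.plot([], [], 'wo')
      line, = axR.plot([], [])
      axR.plot(x, y, 'k.')
      axR.set_xlim(0, 5); axR.set_ylim(-2, 12)

      p = np.array([3.5, -1.0])                    # starting (a, b)
      v = np.zeros(2)
      m, k, dt = 1.0, 8.0, 0.01                    # mass, friction, time step

      def update(_):
          global p, v
          F = neg_grad(*p) - k * v                 # force balance with drag
          v = v + F / m * dt
          p = p + v * dt                           # dot rolls downhill and settles
          dot.set_data([p[0]], [p[1]])
          line.set_data(x, p[0] * x + p[1])
          return dot, line

      ani = FuncAnimation(fig, update, frames=600, interval=16, blit=True)
      plt.show()
      ```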

    • @rocknroll909
      @rocknroll909 1 year ago

      ​@@RichBehiel wow, you're awesome for such an in-depth reply to this. Thank you, I might try this on my own

  • @scienceuser4014
    @scienceuser4014 1 year ago +1

    Perfect video

  • @nooks12
    @nooks12 1 year ago +1

    Satisfying video. Took me back to University.

  • @torquencol
    @torquencol 1 year ago +1

    Lmao thank you for this, this video came into my recommendations just when I needed it most: I've been stressed these last few days doing laboratory reports, where I have to use the regression line a lot 🛌 It made me hate it less

  • @rouninph6349
    @rouninph6349 1 year ago +1

    It looks like you are trying to hypnotize your listener. 😂 Great explanation btw. Using physical arguments to explain a mathematical concept, I like that.

  • @Osniel02
    @Osniel02 1 year ago +1

    just gorgeous!!!

  • @alexander_adnan
    @alexander_adnan 1 year ago +1

    Thank you 🙏 ❤❤❤

  • @PatrickDoolittle
    @PatrickDoolittle 1 year ago

    Like Sujal Gupta, I watched this video because I am studying machine learning. I have been studying simple linear regression for the past couple of weeks now! Just yesterday I started to think about how the Moore-Penrose pseudoinverse generalizes the idea of an inverse to situations where the matrix is not square. I call linear maps to a higher-dimensional space "embeddings" and linear maps to a lower-dimensional space "projections". For a square matrix, which is neither an embedding nor a projection but a linear operator in the same dimension, we can undo the linear mapping by finding the inverse X^-1. In the case of projections, there are many high-dimensional vectors that can be projected down to a given low-dimensional vector, so there is no unique inverse. However, we can solve the system Xb=y for b using the Moore-Penrose *pseudo*inverse: (X^T X)^-1 X^T. When we apply the Moore-Penrose pseudoinverse to the vector of response variables y, we project y onto the row space of X, which is formed by the row vectors, which are linear combinations of the parameters. By projecting y onto every data point (row vector) and adding it up (in essence projecting onto the entire row space), we get our coefficients, and that is the beauty of the Moore-Penrose pseudoinverse!
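    A quick numpy check of that equivalence (toy data mine; it holds when X has full column rank):

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    X = np.column_stack([rng.normal(size=20), np.ones(20)])  # tall design matrix
    y = X @ np.array([2.0, 1.0]) + rng.normal(0, 0.1, 20)

    b_pinv = np.linalg.pinv(X) @ y           # Moore-Penrose pseudoinverse
    b_ne = np.linalg.inv(X.T @ X) @ X.T @ y  # (X^T X)^{-1} X^T y
    print(np.allclose(b_pinv, b_ne))         # True for full column rank
    ```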

    • @davidmurphy563
      @davidmurphy563 1 year ago

      I code DNNs too. Um. I understood your words but not your point. Genuinely curious here.
      So we can calculate the inv matrix. Take the reciprocal of the determinant and multiply it by the matrix with the diagonal swapped and the upper/lower negated. This spits out a new matrix with the property that if you multiply that by the original you get the identity (assuming linear independence).
      Ok fine, all very useful. But what's that got to do with the price of fish?

  • @account4345
    @account4345 1 year ago +1

    Just gotta remind myself this is why I must master linear algebra.

    • @RichBehiel
      @RichBehiel  1 year ago

      Mastering linear algebra is a great and enduring source of spiritual fulfillment 🙏

  • @brianli3493
    @brianli3493 1 year ago +1

    electric potential actually helped me understand this omg

  • @DavidCaveperson
    @DavidCaveperson 1 year ago +1

    Nice video on OLS. I've often wondered, though, why lessons on regression focus on OLS rather than Deming regression, as OLS seems objectively inferior; with so many projections based on the inferior model, we are shooting our research methods in the foot from the start

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Good point. Frankly I think it’s because OLS is easier, and gets the job done in most situations. But I agree that there are times when Deming regression is better. Although someone who uses Deming would presumably have learned OLS first. OLS is also conceptually ideal for explaining how calculus can be used to minimize fit error, so it’s a good go-to image to have in mind when solving fancier optimization problems.

    • @DavidCaveperson
      @DavidCaveperson 1 year ago

      @@RichBehiel I completely understand. In fact, this subject is making me think about applied mathematics, because if we go deeper, it's not like linear regression in any form is the best way to actually model most data. So I'm thinking about dividing a function into splines to create a good fit; you can go too far and smoothly fit every point into a function, but then your function is skewed towards the data set, losing the ability to make good projections. It's an interesting puzzle (and I hated applied mathematics in college)

    • @turun_ambartanen
      @turun_ambartanen 1 year ago +2

      Well, there are quite a few advantages of OLS compared to a total least squares fit.
      For one, in any measurement where x is tightly controlled and y is the thing you want to learn about, OLS is the right tool. Because there are no, or only negligible, errors in x, the horizontal distance of datapoints to the prediction, dx, doesn't matter and must not be included in the fit.
      It also works much better with arbitrary functions than total least squares. For an arbitrary function I don't think there even is _any_ way to calculate the total least squares error. Only well-behaved functions work, and even then you have to define the derivative to perform a total least squares fit.
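      For comparison, a minimal total least squares line fit (Deming regression with equal error variances), done via the SVD on toy data of my own:

      ```python
      import numpy as np

      rng = np.random.default_rng(3)
      t = np.linspace(0, 5, 40)
      x = t + rng.normal(0, 0.3, t.size)      # noise on x as well as y
      y = 2.0 * t + 1.0 + rng.normal(0, 0.3, t.size)

      P = np.column_stack([x, y])
      P0 = P - P.mean(axis=0)                 # center the point cloud
      _, _, Vt = np.linalg.svd(P0)
      dx, dy = Vt[0]                          # principal direction of the cloud
      a = dy / dx                             # slope minimizing perpendicular error
      b = y.mean() - a * x.mean()             # TLS line passes through the centroid
      ```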

  • @StudyEnggFocus
    @StudyEnggFocus 2 months ago

    Hello, Richard! Could you explain what you meant by error metric? Thanks

  • @SD-ni9jh
    @SD-ni9jh 1 year ago +1

    beautiful vid

  • @tylerbakeman
    @tylerbakeman 1 year ago +1

    Instead of calculating Δy, it might be better to calculate the distance a point is from the line (especially for smaller data sets, where Δy could be large but in fact the line could be very close).

  • @kummer45
    @kummer45 1 year ago

    Imagine you have a surface with a magnet. That's a game changer.
    Understanding the concepts of statistics by doing physics is the correct way of UNDERSTANDING mathematics and PHYSICS. However, physics has nothing to do with mathematics and mathematics has nothing to do with physics.
    The magic of this is MODELING. Linear regression, the average, and the Gauss curve are concepts of fundamental use in statistical mechanics. Eventually higher mathematical physics will launch the student into the field of MODEL MAKING.

  • @trustnoone81
    @trustnoone81 1 year ago +1

    Do I understand correctly that the "valley" in the error landscape is the set of all lines that pass through the point (x-bar, y-bar)?

    • @RichBehiel
      @RichBehiel  1 year ago

      Great question, and I’m actually not sure. Anyone know the answer?

  • @denisbaranov1367
    @denisbaranov1367 8 months ago +1

    The beauty of: Linear Regression

  • @guslackner9270
    @guslackner9270 1 year ago +13

    This video is a wonderful explainer! You've listed in the description that linear regression is "very useful in math, science, and engineering" to which I would like to add economics, which is what I am studying. This video and Jazon Jiao's work (ua-cam.com/video/3g-e2aiRfbU/v-deo.html) are the best explanations of the concept that I have seen in video, lecture, or textbook form. I look forward to seeing what else you share on this channel!

  • @einsteingonzalez4336
    @einsteingonzalez4336 1 year ago

    That’s awesome! But what happens if we let N approach infinity where the data points are in a finite domain?

  • @flexeos
    @flexeos 1 year ago +1

    There is always something that bothers me when linear regression is approached this way: from the start, you consider that x and y are of a different nature - the value of x is known perfectly and the error is on y. This is a pretty strong constraint. I am a metrology engineer, and I saw in the comments that you are a metrology engineer too, so you are well aware that in the real world there are errors on both x and y. In that case the error could be, for example, the distance from the data point to the line

      @RichBehiel  1 year ago
      @RichBehiel  Рік тому

      That’s true! And there are ways of doing regression with ds rather than dy. Although often x is more precise than y, for example if you have a sensor array or are sampling data at a fast and precise rate relative to the change in your signal.
      For example, if we’re looking at a trend in some signal that drifts linearly over an hour, and sampling one datapoint per second, with error on the order of microseconds, then x is very precise in that context.
      But you’re right that there are some cases where x and y might be similarly varying.

    • @flexeos
      @flexeos 1 year ago +1

      @@RichBehiel my world is more the relation between 2 voltages at different locations in an analog network, so the noise on both is of the same nature.

    • @angelmendez-rivera351
      @angelmendez-rivera351 5 months ago +1

      @@flexeos I think you are missing the big picture. In most of these data sets (in practice), x(i) is the data set corresponding to the independent variable, the one which you can actually control much more easily, and y(i) is the data set corresponding to the dependent variable, and you want to understand y as a function of x, not the other way around, because the other way around is (in every scenario I have seen physicists, engineers, and other applied S.T.E.M. workers deal with) very impractical and not useful. Now, are there circumstances which are more complicated? Of course there are, but they are the exception, and in those circumstances, the complexities involved are of such a nature that dealing with residuals, as the video does, is not the practical approach anyway.

    • @flexeos
      @flexeos 5 months ago

      @@angelmendez-rivera351 That is not my experience in practice. Let's say that you want to measure a resistor. You inject a current I that you "control", usually using a digital-to-analog converter, and you measure the voltage V across the resistor, and V/I is your resistance. Because the world is not perfect, if you want a better result, you do the measurement with a bunch of Is, and the resistance is now the slope of the best line through the cloud of points (V, I). To have a better idea of the exact value of I, while you set it digitally, you have to measure its actual value, as the translation between the digital value and the actual current is anything but perfect. So in practice you have a cloud of points (V, I) with the same kinds of error (noise, offset, non-linearity...) on both V and I. If you assume that I is an independent variable, you will end up with a bias. There was a math paper on that bias effect almost 100 years ago that I read, but I cannot find the reference right now. If an electronic example seems too specific, consider a typical example given to students, like annual income vs age in years. Age looks like an independent variable, but in reality by definition there is a 1-year uncertainty on it, which is not too good, as the relative error bar is not even constant. Of course in such an example the required precision is not a big problem, so you can forget about those subtleties. But in metrology you are tracking a few parts per million. Not taking that into account would be like trying to design GPS without taking general relativistic effects into account (accuracy on location becomes > 10 km). My 2 cents

  • @peterwolf8092
    @peterwolf8092 1 year ago +1

    😂 I really love this and wish my high school students would understand it, so I could share it with them.

  • @nofalldamage
    @nofalldamage 1 year ago +1

    Great video.
    Is the matrix at the end always invertible?

    • @RichBehiel
      @RichBehiel  1 year ago

      Great question! It’s invertible as long as its determinant isn’t zero. The matrix has the form [A, B; B, N], where A is the sum of the xᵢ², B is the sum of the xᵢ, and N is the number of points, so its determinant is AN - B². For this to be zero would require that AN = B², in other words for the sum of xᵢ² times N to equal (the sum of xᵢ)². I’m not sure if this can happen; it feels like it can be proven one way or the other without a ton of work, but I’ve gotta go. So I leave that as an exercise for the reader! :)

    • @nofalldamage
      @nofalldamage 1 year ago

      @@RichBehiel I think one of the cases where the matrix is not invertible is if all the points are on a vertical line. Kind of makes sense since then the form y = ax + b doesn't really work.
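      For the record, a short identity (my addition, not from the thread) confirms that guess:

      ```latex
      \det \begin{pmatrix} \sum_i x_i^2 & \sum_i x_i \\ \sum_i x_i & N \end{pmatrix}
      = N \sum_i x_i^2 - \Bigl( \sum_i x_i \Bigr)^{2}
      = N \sum_i \left( x_i - \bar{x} \right)^{2} \;\ge\; 0
      ```

      with x̄ the mean of the x values. Equality holds exactly when every xᵢ equals x̄, i.e. when all the points sit on one vertical line, matching the comment above.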

  • @JOHNSMITH-ve3rq
    @JOHNSMITH-ve3rq 1 year ago +1

    Wow. Seen so many videos, read so many papers and books - but this one takes the cake. Would love to see you doing this but for more complex models with fixed effects and all sorts of other bells and whistles. Impressive!!

  • @jursamaj
    @jursamaj 8 months ago +1

    And you can fit other curves with simple transforms of one or both axes, like log or exp.
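    For instance (a made-up example), an exponential y = A·e^(bx) becomes a straight line after taking the log of the y-axis:

    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    x = np.linspace(0, 4, 40)
    y = 3.0 * np.exp(0.7 * x) * rng.lognormal(0, 0.05, x.size)  # noisy exponential

    b, lnA = np.polyfit(x, np.log(y), 1)  # ordinary line fit in (x, ln y)
    A = np.exp(lnA)                       # back-transform: y ~ A * exp(b*x)
    ```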

  • @potatochipbirdkite659
    @potatochipbirdkite659 1 year ago +1

    Do you have the blue dot following a Lissajous curve?

    • @RichBehiel
      @RichBehiel  1 year ago

      I forget what I did for that, I think I just had some sines and cosines of different frequency in x and y.

  • @bronga645
    @bronga645 1 year ago +1

    Subbed, liked, and commented for your effort; even if you don't make much on YT, you are a great mathematician! And I am sure you will make it in life and be a help to humanity as a whole. Thank you

    • @RichBehiel
      @RichBehiel  1 year ago

      Thanks for the kind comment! :)

  • @kennethtrimble5144
    @kennethtrimble5144 11 days ago +1

    excellent

  • @zukofire6424
    @zukofire6424 1 year ago +1

    Beautiful, and I'm surprised I never knew some of what you explained. I wanna add something irrelevant: you are so handsome!

  • @akidnag
    @akidnag 1 year ago +1

    Great vid, thank you!
    I'm struggling with how you visualize the "parameter space" in Python, though.

    • @akidnag
      @akidnag 1 year ago +1

      I did a meshgrid X, Y for a and b from -5 to 5 with 100 points. Then I calculated the modulus as Z = the sum of the sqrt of the square of each equation and did a contourf(X, Y, Z), but no luck :/

    • @akidnag
      @akidnag 1 year ago

      I think the quiver plot is ok as quiver(X,Y,eq1,eq2)

    • @RichBehiel
      @RichBehiel  1 year ago +1

      I did a contourf and a quiver. If the contourf isn’t working, it’s possible the color limits are off? Oh, actually come to think of it, I might have taken the log or sqrt of the error, to flatten out the landscape so it’s easier to see. Basically applying a nonlinear colormap.

    • @akidnag
      @akidnag 1 year ago

      @@RichBehiel Thanks a lot! Keep up the great work!

    • @akidnag
      @akidnag 1 year ago

      Still no good, I'm sorry.
      So in contourf it's V (or log(V) or sqrt(V)), and in quiver it's Fa, Fb, over the spanned a and b, right?
      Sorry to bother you, but I feel like I understand, yet not getting the same results makes me doubt what I'm doing wrong :/
      Would it be too much to ask you to share the code for visualizing the parameter space?

  • @sarthakjain1824
    @sarthakjain1824 1 year ago +2

    That was on the level of 3Blue1Brown videos

    • @RichBehiel
      @RichBehiel  1 year ago

      Thanks! :) Grant is a role model for sure. The aesthetics of his videos are much better than mine though 😅 But I’ll get better over time.

  • @jamesmcfarlane3469
    @jamesmcfarlane3469 1 year ago +1

    Is this method, or something similar, applicable to nonlinear least squares? I did a project over Christmas using nonlinear least squares regression and this would’ve been super helpful 😅

    • @RichBehiel
      @RichBehiel  1 year ago +1

      The same concept of minimizing a least squares objective function by setting the gradient to zero applies to nonlinear least squares, but there are also extra steps involved.

  • @peterwolf8092
    @peterwolf8092 1 year ago +1

    Is it possible to get a "second best" valley? A pseudo-best solution?

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Not for linear regression, but for fits with more parameters yes. Gradient descent can sometimes get stuck in a local minimum, a valley other than the best one. If there’s an analytic solution, it might involve the roots of a polynomial or something, so you can have multiple values which are locally optimal. In that situation, the height of the objective function at each optimum can be quickly compared, since the list should be pretty short.

  • @badermuteb1012
    @badermuteb1012 6 months ago

    How did you code these interactive plots? Thanks

  • @benandrew9852
    @benandrew9852 1 year ago +1

    holy shit
    I have genuinely never even come close to thinking about it like this
    top marks, no notes

  • @BrunoJMR
    @BrunoJMR 1 year ago +1

    When calculating the zero gradient, how do you avoid the local minimum problem? They are also zeroes of the gradient

    • @RichBehiel
      @RichBehiel  1 year ago +1

      True! For more complicated fits, the parameter space becomes more textured and you’ll often have multiple local minima. But with an analytic solution, these minima can be quickly calculated, for example as roots of a polynomial. Then there’s just a small list of points at which the objective function can be evaluated and compared, and the minimum can be chosen from the list.

    • @BrunoJMR
      @BrunoJMR 1 year ago +1

      @@RichBehiel Thanks! So the analytic solution gives us all the minima and we then can just check which one is the lowest. Cool

    • @RichBehiel
      @RichBehiel  1 year ago +1

      Yup. There may be some maxima and saddle points in there too, since those also have zero gradient, but those can either be filtered out analytically by solving some second derivatives for additional constraints, or just kept in the list and they won’t be the minimum so it doesn’t matter. In practice, people almost always do the latter. The only exception would be if the data rate is very high and there’s some benefit to solving those equations in exchange for a marginally faster routine. So in super high performance scenarios, the second derivatives are worth looking at.
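      A toy version of that recipe, with a single made-up parameter whose gradient happens to be a cubic:

      ```python
      import numpy as np

      # objective V(p) = p^4/4 - 1.5 p^2 + p, so the gradient is p^3 - 3p + 1
      V = lambda p: p**4 / 4 - 1.5 * p**2 + p

      crit = np.roots([1, 0, -3, 1])     # every zero of the gradient
      crit = crit[np.isreal(crit)].real  # keep the real critical points
      best = crit[np.argmin(V(crit))]    # comparing V sorts minima from the rest
      ```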

  • @willbedeadsoon
    @willbedeadsoon 1 year ago

    When I run the code in VS Code it shows nothing, but if I start debugging line by line, at "plt.subplots(figsize=[8, 4.5])" it shows the matplotlib window. It's weird to me. What's going on here?

    • @RichBehiel
      @RichBehiel  1 year ago

      Hmm… I’m not sure, tbh. Do you have all the modules installed? I’d recommend installing Anaconda, then running the code in Spyder (it comes with Anaconda). That way you’ll have a lot of mathematical and scientific modules already installed. Plus, Spyder looks cool.

    • @SkrtlIl
      @SkrtlIl 1 year ago

      Not sure why you get the window in debugging mode, but for normal Python scripts you usually have to call plt.show() manually, while notebooks trigger plots inside the corresponding cell. So you could also change your .py to .ipynb and run that in VS Code

  • @sgtreckless5183
    @sgtreckless5183 1 year ago +1

    Is the outer product sum in the final formula always non-singular, so it always has an inverse?

    • @RichBehiel
      @RichBehiel  1 year ago

      I believe so, but I’m not 100% sure actually. As a good exercise in math, you can explore if it might be noninvertible under some conditions, just set the determinant to zero and see what a dataset would have to be like in order for that to happen.
      I’ve done millions, maybe billions, of linear regressions (on data streams) and have never run into this problem though.

    • @sgtreckless5183
      @sgtreckless5183 1 year ago +1

      @@RichBehiel Doing just the quickest amount of working out with a dataset of 3 values, I think the sum of outer products would only be singular if all the x values are the same, which obviously isn't going to happen. It's fairly easy to show that if we have a dataset like this, the matrix is singular (the 1st row of the matrix is just the second multiplied by x_i), though I'm not sure how you'd prove it the other way around (i.e. that the matrix is non-singular in all other cases).

    • @RichBehiel
      @RichBehiel  1 year ago

      That makes sense! Btw, these equations are equivalent to a force and torque balance, if the residuals are imagined as elastic springs, so physically it makes sense that it would only be singular if the x values are all the same, or something like that.

  • @agentdarkboote
    @agentdarkboote 9 months ago

    I would love it if you could show why the pseudoinverse recovers this method!

  • @PrismaticCatastrophism
    @PrismaticCatastrophism 1 year ago +1

    Could you make a similar video about parabolic graphs?

    • @RichBehiel
      @RichBehiel  1 year ago

      I’d like to someday! The procedure is very similar, but ax^2 + bx + c instead of ax + b. It’s a 3D parameter space, but the same techniques work.

  • @m9l0m6nmelkior7
    @m9l0m6nmelkior7 2 months ago

    But is that matrix invertible if there is more than one extremum ?