Excellent explanation, and a channel I'm looking forward to digging deeper! I'm a fellow medical student, with an engineering background so these videos really do intrigue me. Got a few questions...
- What field within medicine interests you? I wonder if there's any area which is more conducive to your kind of conceptual understanding and deductive reasoning... I kind of put that part of my brain away the last few years, and only use it as a hobby like watching these videos haha
- The video's derivation makes sense. And there's a sort of beauty of setting up the normal distribution from a darts analogy (i.e. probability falling off with distance). A rotational symmetry also makes sense. But what's not intuitive to me is why it is a 2D darts setup, and not a 1D or 3D darts setup? Not asking for a derivation, but just curious for any intuitive insights here.
+GypsyNulang I have the intuition that I'll become a pathologist someday although I've been told I have a surgeon's personality. I like pathology because the hours are regular and I'd have time to teach too. We could do a 3D board if you wanted but I like the 2D example because we have experience with that. Maxwell in his derivation works in three dimensions, thinking of gas diffusing from a central source in all three directions. The derivation is conceptually the same, save for a few constants: statistical independence of all coordinates and dependence only on distance from origin.
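A sketch of how the three-dimensional version mentioned above would go, under the same two assumptions (independent coordinates, dependence only on distance from the origin). The symbols f, φ and λ follow the 2D derivation; the step-by-step layout is an assumption of this sketch, not a transcription of Maxwell's paper.

```latex
\begin{align*}
f(x)\,f(y)\,f(z) &= \varphi\!\left(\sqrt{x^2+y^2+z^2}\right)
  && \text{independence and radial symmetry} \\
\varphi(x) &= f(0)^2 f(x) = \lambda^2 f(x)
  && \text{set } y = z = 0,\ \lambda := f(0) \\
\lambda^2 f\!\left(\sqrt{x^2+y^2+z^2}\right) &= f(x)\,f(y)\,f(z) \\
g\!\left(\sqrt{x^2+y^2+z^2}\right) &= g(x)\,g(y)\,g(z)
  && g := f/\lambda \\
g(x) &= e^{A x^2}, \qquad f(x) = \lambda e^{A x^2}
\end{align*}
```

Only the power of λ changes relative to the 2D case (λ² instead of λ); the functional equation for g and its exponential solution are the same.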
So now, 26 years after I first encountered this equation in college, I finally know where it came from. A cool thought experiment coupled with some “cosmetics” which (true to their name) conceal its true identity.
Well, good job explaining this, but it leaves a big question unanswered (as do most "derivation" videos of the Gaussian distribution). Namely, you explained how to derive f(x), the pdf of the x coordinate on the dartboard, and this is "a" bell curve. We haven't shown this is also "the" Gaussian bell curve of the central limit theorem; it's conceivable that f(x) only roughly looks like the Gaussian but is not identical. How do we show f(x) is the Gaussian of the CLT?
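Not a proof, but a numerical illustration of the question: the curve derived in the video has the closed form (1/(σ√(2π)))·e^(-(x-μ)²/(2σ²)), and the central limit theorem says standardized sums of independent draws approach exactly that curve. The sketch below compares standardized sums of uniform draws against the normal CDF. The helper names (`normal_cdf`, `standardized_sum`, `max_cdf_gap`) and the choice of uniform draws are assumptions made for this example.

```python
import bisect
import math
import random


def normal_cdf(z):
    """CDF of the standard normal curve, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def standardized_sum(n, rng):
    """Add n uniform(0, 1) draws, then rescale to mean 0 and variance 1."""
    total = sum(rng.random() for _ in range(n))
    # A uniform(0, 1) draw has mean 1/2 and variance 1/12.
    return (total - 0.5 * n) / math.sqrt(n / 12.0)


def max_cdf_gap(n, trials=20000, seed=0):
    """Largest gap between the empirical CDF and the normal CDF on a small grid."""
    rng = random.Random(seed)
    samples = sorted(standardized_sum(n, rng) for _ in range(trials))
    gap = 0.0
    for z in (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
        # Fraction of simulated values at or below z.
        empirical = bisect.bisect_right(samples, z) / trials
        gap = max(gap, abs(empirical - normal_cdf(z)))
    return gap


if __name__ == "__main__":
    for n in (1, 2, 30):
        print(n, round(max_cdf_gap(n), 4))
```

The gap should shrink as n grows. An actual proof that the limit is this exact curve usually goes through characteristic functions, which is outside what the video covers.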
😮😮😮👏👏👏👌 Awesome... for years it has troubled me how to derive the normal distribution equation... trust me, I had nightmares... 😅😅😅 I am too bad at just mugging things up... thanks a ton... I used to wonder how we are able to connect two independent sample spaces through the mysterious Z... now it's quite clear. 1. The sample space has to behave like a normally distributed phenomenon. 2. In any case, the probability density curve will have an area of 1... and that helps me understand the t distribution even better... thank you...
Lots of assumptions, but worth it!! However, in a multidimensional scenario, x^2 + y^2 != r^2, so I think the Gaussian distribution might need improvement.
I don't understand how you can say at 8:46 that f(sqrt(x^2+y^2)) is equal to f(x)f(y) just by analysing a single case (if you put sqrt(x^2+y^2)=x, this is true in just a single case, and it's the case you described, y=0)
That was done because he wanted to get rid of the function φ and express the equation in terms of the function f only. He used the specific case y=0 and defined the constant λ as f(0). He said that if f(x)f(y)=φ(sqrt(x^2+y^2)) is true in general, then it must be true in a specific case, y=0. So φ(x)=λf(x) for any argument x, and it follows that φ(sqrt(x^2+y^2))=λf(sqrt(x^2+y^2))=f(x)f(y). You see at the end that λf(sqrt(x^2+y^2))=f(x)f(y) with no φ involved.
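The steps in that reply can be laid out line by line. This is a sketch using the thread's own symbols (f, φ, λ); the only added assumption is x ≥ 0 in the second line, which is natural because φ takes a distance as its argument.

```latex
\begin{align*}
f(x)\,f(y) &= \varphi\!\left(\sqrt{x^2+y^2}\right)
  && \text{holds for all } x, y \\
f(x)\,f(0) &= \varphi\!\left(\sqrt{x^2}\right) = \varphi(x)
  && \text{special case } y = 0,\ x \ge 0 \\
\varphi(t) &= \lambda f(t)
  && \lambda := f(0),\ \text{true for every argument } t \\
\varphi\!\left(\sqrt{x^2+y^2}\right) &= \lambda f\!\left(\sqrt{x^2+y^2}\right)
  && \text{take } t = \sqrt{x^2+y^2} \\
\lambda f\!\left(\sqrt{x^2+y^2}\right) &= f(x)\,f(y)
  && \text{substitute into the first line}
\end{align*}
```

The special case y = 0 is used only to learn how φ and f are related as functions; once that relation is known, it can be applied at any argument, including √(x²+y²).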
Because φ is the probability density function, so its units are (probability)/(area) (or probability per unit area). Therefore, the probability of a dart landing at some point is φ*dA, because you need to get rid of the area units at the bottom.
Great explanation! I followed everything except for the part where you introduced the integral for variance (around 28:00). Could someone clarify where the integral of x^2 * f(x) * dx comes from?
Basically, the variance of a random variable X is defined as the expected value of the squared deviation from the mean of X: E((X-mean)^2) = variance, where E is the expected value. The expected value is essentially a weighted mean; the weighted mean is for discrete values of x, and the expected value is for continuous functions of x that can be integrated. The formula for the weighted mean is the sum of all the occurring values multiplied by their weights (numbers of occurrences), divided by the total weight (the total number of occurrences). For continuous probability density functions, the weights are not the numbers of occurrences of each value but the probabilities of each value (f(x)*dx), and since the probabilities add up to 1, the division by the total weight drops out, leaving the integral of x multiplied by the probability of x over all values (E(X) = Integral of x*f(x)dx). Since the mean of the normal distribution here is 0, the squared deviation from the mean is just x^2, so E(X^2) = Integral of x^2*f(x)dx. I hope it's a bit clearer, although I think this was quite confusing, so you should read it through several times and really focus on every part if you want to understand it :D.
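A quick numerical check of the integrals described above, as a minimal sketch: it approximates the area, E(X) and E(X^2) for a zero-mean normal curve with a midpoint rule. The function names (`zero_mean_pdf`, `midpoint_integral`, `moments`), the integration range of ±12σ and the step count are all choices made for this example.

```python
import math


def zero_mean_pdf(x, sigma):
    """Normal density with mean 0 and standard deviation sigma."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))


def midpoint_integral(func, lo, hi, steps=20000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (i + 0.5) * width) for i in range(steps))


def moments(sigma):
    """Return (area, E[X], E[X^2]) for the zero-mean normal density."""
    # The tails beyond 12 standard deviations are negligible.
    lo, hi = -12.0 * sigma, 12.0 * sigma
    area = midpoint_integral(lambda x: zero_mean_pdf(x, sigma), lo, hi)
    mean = midpoint_integral(lambda x: x * zero_mean_pdf(x, sigma), lo, hi)
    second = midpoint_integral(lambda x: x * x * zero_mean_pdf(x, sigma), lo, hi)
    return area, mean, second


if __name__ == "__main__":
    print(moments(2.0))
```

With the mean at 0, the third value is the variance, so it should come out close to sigma squared.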
Thanks for this - very well explained. However, one question at the back of my mind is that if you consider a non-zero mean, wouldn't you need to adapt the definition at 27:18 to be $$ \int (x - \mu)^2 f(x) dx $$ ? And wouldn't this complicate things a lot? Don't get me wrong, I understand the intuition behind just shifting the mean, it just seems like a potential snag that I would get caught on in an exam.
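A sketch of why the non-zero mean does not complicate the integral, assuming the general density with mean μ and standard deviation σ: the substitution u = x − μ turns it back into the zero-mean integral.

```latex
\begin{align*}
\int_{-\infty}^{\infty} (x-\mu)^2\,
  \frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x-\mu)^2/(2\sigma^2)}\,dx
&= \int_{-\infty}^{\infty} u^2\,
  \frac{1}{\sigma\sqrt{2\pi}}\,e^{-u^2/(2\sigma^2)}\,du
  && u = x-\mu,\ du = dx \\
&= \sigma^2
\end{align*}
```

The limits stay at ±∞ under the shift, so the zero-mean result carries over unchanged.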
This video made me so happy, best 35 minutes of my day for certain, maybe my whole week. Cheers for this; if you see this comment 3 years later, you're a legend for taking the time to explain this so clearly.
Thank god someone cares to explain this equation that just floats around in the math realm with no explanation from teachers other than, "here!"
Come on, it is not an equation floating around. There are many derivations in books. Actually this derivation is rather too long.
@@univuniveral9713 The best derivations are the easiest to understand imo
@@wurttmapper2200 True
Just because you were too lazy to seek out a proof. Hey, but too easy to blame your teachers. Dufus.
@@donegal79 the job of a good teacher is to point a student in the right direction and not just hand wave complex topics.
Plus, he did go and look for the proof as is evident in him watching this video.
What I love about this presentation, unlike some others I have seen, is that it does not skip over any steps. Each step is very clear and easy to follow. Wonderful job, sir. Thank you.
This video is absolutely brilliant! I've always wanted to know how the normal distribution curve was derived, and your explanation was perfect! Thanks so much!!! Math is beautiful
"Maths is beautiful" you're so right. 😁
@Ask why? ? differential equations. It's the maths for working out when more than one thing is changing at the same time. As opposed to normal calculus which only changes one thing. Like how far a ball goes if you throw it, if you change both the angle and how hard you throw.
J. I can’t really link your comments to your profile photo... it’s so illogical
@@stevenson720 I know my question is about 11 months too late, but regarding your reply, i.e. "differential equations ... for working out when more than one thing is changing...", etc., isn't that what multivariable calculus is actually for?
I suppose it really depends on whether we're talking about ordinary diff equations or partial diff equations, since ODE's deal with one independent variable, while PDE's are for multiple independent variables. I'm no math major, just an independent learner and lover of maths, so my response might not be 100% accurate, but if it's not, anyone may feel free to correct me.
This is fantastic! Everything is explained and paced so well; no other video online has derived the normal distribution so clearly as you have.
It’s beautiful to observe that the number of YouTube “likes” decreases when a video is educational in nature and not about useless make-up tutorials etc. This itself IS a proof that the curious people who actually want to understand, and therefore watch this helpful video to the end, are from the “other side” of the Gaussian Distribution. ;-) Thanks for the fantastic job!!
This video is a brilliant illustration of Einstein's famous sentence: "Everything should be made as simple as possible, but not simpler". The derivation is beautiful, elegant and crystal clear. Thanks so much for sharing your knowledge!
I'm so grateful that I found this absolute gem of YouTube, keep posting videos. You probably inspired thousands of people to be more interested in math/science
My physics professor from Greece pronounced it "φ"
wrong, i'm pretty sure it is pronounced "φ"
@@yerr234 What's funny is that my professors here in Germany, even though one is from Russia and the other is a German, both pronounce it "φ".
You're all funny😂
It's φ not φ
"We are more likely to find a dot near the bulls eye." You've obviously never seen my wife play darts.
+FriendEd
More likely to strike someone else in the eye...amirite?
wait what are you doing here
It was a good joke though...
But the last equation, the general one covers that too.
Bad at throwing darts?? Then the value of σ is larger. Or if you're good at throwing darts but your shots cluster somewhere other than the bull's eye, your μ will be different.
So never lose faith in maths, especially in general maths.
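To make the σ and μ point concrete, here is a minimal sketch of the general formula in code. The function name `normal_pdf` and the example values of μ and σ are made up for illustration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """General normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# A steady thrower (small sigma) versus an erratic one (large sigma),
# both aiming at the bull's eye at x = 0.
steady_peak = normal_pdf(0.0, mu=0.0, sigma=0.5)
erratic_peak = normal_pdf(0.0, mu=0.0, sigma=3.0)

# A steady thrower whose darts cluster around x = 2 instead of the bull's eye.
offset_at_bull = normal_pdf(0.0, mu=2.0, sigma=0.5)
offset_at_cluster = normal_pdf(2.0, mu=2.0, sigma=0.5)

if __name__ == "__main__":
    print(steady_peak, erratic_peak)
    print(offset_at_bull, offset_at_cluster)
```

A larger σ flattens the peak, while changing μ moves the same peak to wherever the shots cluster.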
Dude this was a great video, keep up the great work!! I love how at 3 am in the morning I am binging on your videos, goes to show that you have skill
Fantastic explanation, step by step and does not assume the viewer knows any particular mathematical derivation. best video i've seen on the derivation of the normal distribution!
I'm from Brazil and I just found this class right here. I asked my professor how to derive this formula and she did not know. This was one of the most impressive classes that I have ever seen. Thank you so much!!!!
Beautiful derivation of a ubiquitous formula: the hat trick comes at 22:48! To complete the discovery of this treasure, I am now deep diving to revise my Euler integral. Thanks so much!
Brilliant!!!! This is a MUST for any student of statistics. The way statistics are normally taught... many unanswered assumptions... which are answered by this video.
Great vid. Cleared up much of the mystery of where normal distribution comes from. Have forgotten most of my calculus but was still able to follow along!
Wow, this was awesome. I'm reading E. T. Jaynes's Probability Theory. In Chapter 7, he performs this derivation, but as is often the case, he assumes the reader is as fluent as he is with functional analysis. This video really helped me fill in the gaps. Can't wait to watch the rest of your videos. Thanks a bunch!
I am incredibly new to statistics and have never actually taken a course, but I have taken physics and engineering courses that apply physics; it's pretty neat to see that the Dirac delta function is a limiting case of the normal distribution in which lambda approaches infinity, which I realized as soon as you showed how lambda transforms the shape of the graph. Very cool video!
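A short sketch of why that limit behaves like a delta function, assuming the normalized curve at that point in the video has the form f(x) = λe^(-πλ²x²) (the exact form on screen is an assumption here):

```latex
\begin{align*}
\int_{-\infty}^{\infty} \lambda e^{-\pi\lambda^2 x^2}\,dx
  &= \lambda\sqrt{\frac{\pi}{\pi\lambda^2}} = 1
  && \text{area is 1 for every } \lambda > 0 \\
f(0) &= \lambda \to \infty
  && \text{peak height grows} \\
\sigma &= \frac{1}{\lambda\sqrt{2\pi}} \to 0
  && \text{width shrinks}
\end{align*}
```

So as λ grows, the unit area is squeezed into an ever narrower spike at 0, which is the usual picture of the Dirac delta as a limit of Gaussians.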
I have always wondered about this formula. Your explanation is the most concise and understandable one even to a novice like me. Thanks a million.
This is the most clear and complete derivation of the Normal distribution I've seen.
Thanks for sharing
One of the best videos on the internet, love it!
This is beautiful; it helped me enormously in my design of experiments class. Simply, thank you! I wouldn't have managed it without you.
I love this video, this derivation is spectacular, I love it when mathematics links various seemingly unrelated concepts with each other and yields this beauty. That being said, I also think Gauss' proof in the article you provide is far easier and more accessible to students like myself.
I don't know what to say but you just saved my life. I have been looking for a proper derivation and not the 5 mins ones for months. Thank you thank you thank you so so so much.
Thank you! Most simple and well demonstrated video about the subject that I've seen. Congratulations!
Great work in this video! After watching it, I just can't appreciate enough the original inventor of this function, Carl Friedrich Gauss!
Wow, amazing explanation that cleared up everything about the normal distribution for me! Your calm way of teaching is very clear and enjoyable, thank you!
Amazing, finally someone that explains completely and holistically how to derive the Gaussian density function. Thank you!
i was so confused how the "2" comes in the formula
now i finally understand
thanks :)
Man, what an incredible video! Loved your derivation.
I'm a software developer. I'm pretty comfortable with discrete math and terrible with calculus. But your explanation is so clear that I was able to understand most of it.
Thank you for such high quality content. This video deserves more views!
I'm subscribed (which happens rarely!).
Really well explained. I'm relatively new to this type of thinking and it was illuminating! On the fly definitions of new functions that lead you in any direction you'd like, seems really powerful, but also like a puzzle. Thanks for making this video!
For the transition from y = 0 to the general case, the 1D equation can be generalized to 2D due to radial symmetry, which makes the x axis equivalent to any other line going through (0,0).
Regarding the number of dimensions, a minimum of two is necessary to specify coordinate independence and radial symmetry, which together give the form of an exponential.
Lovely, unique video. Thanks!
best video for derivation of gaussian distribution ever.
Seriously Mathoma... thank you so much sir. Thank you. You are doing God's work. You will ride shiny and chrome in Valmatha. Excellent video.
Absolutely brilliant! You make Mathematics look like what it's meant to be, simple. Thank you for this great video.
It took me 2 days to understand the concept behind this topic for my statistics class. This video cleared everything up for me. Thx!!
Amazing... This video really helps to understand the Gaussian distribution a lot better. Thank you.
Thank you for taking your time and explaining it beautifully!
0:42 Thought Experiment: Dart Board
1:57 Probability Density Function; φ
I have been looking for a video like this for so long, a clear derivation from scratch of the normal distribution
Thank you for the brilliant derivation from nearly the First Principles. Thank you indeed.
I really wish there were a "History of Mathematics" YT channel that also gave the reasons for the clever decisions taken at crucial steps in deriving historically important equations. These steps have nothing to do with computation; they are a "Leap of Intelligence", because of which mathematics has prospered for so long. It began with Pythagoras's proof that √2 is an irrational number.
This is an outstanding video. You have explained all the details with clarity. Thank you!
We need great explainers like you... Awesomely explained.
I actually enjoyed watching this video. I expected to learn, but i never expected to enjoy the derivation of PDF. This was fun!
I have a neat trick for resolving 14:00. Replace the unknown g(·) by (u∘h)(·), where h(·) is the squaring function.
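A sketch of how that substitution plays out, assuming the equation on screen at 14:00 is g(√(x²+y²)) = g(x)g(y) with g = f/λ (the timestamp's exact content is an assumption), and assuming g is continuous and positive:

```latex
\begin{align*}
g\!\left(\sqrt{x^2+y^2}\right) &= g(x)\,g(y) \\
g(t) &= (u \circ h)(t) = u(t^2)
  && h(t) = t^2 \\
u(x^2+y^2) &= u(x^2)\,u(y^2)
  && \text{since } h\!\left(\sqrt{x^2+y^2}\right) = x^2+y^2 \\
u(a+b) &= u(a)\,u(b)
  && a = x^2,\ b = y^2 \\
u(a) &= e^{A a}, \qquad g(x) = e^{A x^2}
\end{align*}
```

The square root disappears because h is applied to the whole argument √(x²+y²), not to x and y separately, which is why the result is x²+y² and not √(x⁴+y⁴).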
wow i am watching your video at 12/25/2020 and i suppose your video is my Christmas gift. so beautiful explanation.
this is a fantastic lecture and literally reveals mathemagic of normal distribution curve, i am going to see this again and again and...
We love you from the bottom of our hearts. Everything is soooooooooo clear unlike other videos on youtube. I am trying to learn data analytics, on my own from youtube for free Cause I want a neww skill your standard deviation
Awesome explanation...I have been trying to understand the function behind normal curve all this while and this is so beautifully explained...thanks a ton
In your opinion, what looks nicer when written out: Gaussian distribution or the Schrödinger equation? And in general, what formula do you think is most aesthetically pleasing? Also, if you haven't seen it yet, look up Gauss's signature. It's one of the best things ever.
It's a very good video. Thank you. I just wish the frequent commercials weren't as loud.
I like how you're writing the integral signs! Anyway this is a super clear video, thank you so much! I'm reading Maxwell's 1859 paper and I wasn't sure where the Ce^Ax^2 came from.
Yes. This video Is perfect. I've been looking for this for years. You are wonderful. Thank you so much. Excellent explanation!!!
27:21 I am unable to get the intuition about the variance integral part - how the formula came up?
The formula checks how much the data vary from the mean value. Squaring treats values below the mean (negative deviations) and above the mean (positive deviations) alike.
So the formula for variance is the average of all the squared deviations, i.e. Sum([X-Xbar]^2)/n. In a continuous setting, we integrate rather than sum, so we integrate [X-Xbar]^2 multiplied by the PDF. As Xbar, or the mean, is equal to 0, we end up integrating only [X]^2*pdf from negative infinity to positive infinity.
Hope this helps man.
en.wikipedia.org/wiki/Variance#Absolutely_continuous_random_variable
Thanks so much for the clarity of the video.
Amazingly satisfying video; I've enjoyed your videos for a long time and this was particularly good. Everything is explained at just the right level and the derivation was so logical it all felt obvious by the end. Thank you for putting so much effort into your videos!
+Will Price
That's very kind of you to say. My pleasure.
This was wonderful, thank you, I'm excited to see the rest of your channel now!
hey great idea for a video. i found in undergrad that my stats prof didnt really care to elaborate on this, and so stats ended up being my least favorite math class. turns out you need to understand this stuff to do my actual job, so this is much appreciated
14:15 How can squaring remove the root? Shouldn't it be sqrt(x^4+y^4) if
x-->x^2 and y-->y^2?
He said Exponentiating them.
you nailed it with this video tho! it's so cool to see that this derivation is actually an insight derived from a multivariable case.
Loved it. And to think that's just the beginning....
Class teachers often omit the derivation of the Normal Distribution. I always wondered how the bell curve formula is derived. Here is the answer. Thanks a lot.
Crystal clear explanation, thank you Sir for the great work!
Clear clean well described well paced, excellent. Thank you.
Terrific derivation! The only potentially ambiguous part is at 21:07 when "A" is written as the total area under the curve, whilst "A" is still written at the top right as "A = -h^2". Probably not confusing to anyone who understands the rest of the derivation, but it still bothers me a bit to have "A" on the screen twice, with two different usages. 😉
You are awesome bro..
Finally someone cared about proof..
I am not sure if the equation written at 21:37 is correct; isn't φ(x) the probability distribution function and not f(x)? Why is f(x)dx being integrated?
It is correct, but only because it doesn't matter which symbol he chooses for the particular probability function (be it a distribution or a density function, for discrete or continuous data respectively), or for the proportionality constant "lambda".
I suppose if you really wanted to be pedantic and make sure that everything he is saying is correct, then after the line of work:
f(x) = lambda*e^(Ax^2)
we would then say, since phi(x) = lambda*f(x):
=> phi(x) = (lambda^2)*e^(Ax^2)
Now redefine f(x) and our constant "lambda" (which, remember, does not depend on how we choose to name it), such that the old f is the new f divided by lambda:
f_old(x) = f_new(x)/(lambda)
=> f_new(x) = phi(x)
and thus, from here on, our f(x) represents the probability function which we want.
Also, redefine lambda so that the old lambda is the square root of the new one:
lambda_old = lambda_new^(1/2)
(remember: both are still constants)
=> phi(x) = lambda_new*e^(Ax^2)
and thus, we may now return to his line of working, where f(x) represents this particular probability function:
f(x) = lambda*e^(Ax^2)
allowing us to set the integral from negative infinity to positive infinity of f(x) equal to 1.
Beautiful! Thank you so much. I wish this was around when I went through my MS.
oh, that's how you get e^-(x^2)
+Mi Les
Indeed.
This video is quite interesting. I got bored one afternoon and searched for the derivation of the normal distribution, because at school I was learning probability questions involving normal distributions, where we use a table to find values for the standardized normal distribution (mean 0, standard deviation 1). This is really good, except that I lack the attention span to understand most of it.
8:00 I am a bit confused here. How did you go from
*φ(x) = λ f(x)*
to
*φ(√(x^2 + y^2)) = λ f(√(x^2 + y^2))*
Also, according to your argument, *φ(x) = λ f(x)* is only true for points on the x axis,
since *φ(x) = φ(√(x^2 + 0^2))*
+xoppa09
All these functions are of a single variable, no matter what name you give to that variable: call it 'x', call it '√(x^2 + y^2)', or 'y', or whatever. The observation we make when we see φ(x) = λf(x) is that whenever something is evaluated by φ, this is the same as it being evaluated by f times some multiplier λ. It probably would have been better if I had written φ(•) = λf(•), or dropped the argument and written φ = λf instead of φ(x) = λf(x), to emphasize this.
this is so well presented, thank you so much for this
Most beautiful proof I've seen!!!
Excellent explanation, and a channel I'm looking forward to digging deeper! I'm a fellow medical student, with an engineering background so these videos really do intrigue me. Got a few questions...
- What field within medicine interests you? I wonder if there's any area which is more conducive to your kind of conceptual understanding and deductive reasoning... I kind of put that part of my brain away the last few years, and only use it as a hobby like watching these videos haha
- The video's derivation makes sense. And there's a sort of beauty of setting up the normal distribution from a darts analogy (i.e. probability falling off with distance). A rotational symmetry also makes sense. But what's not intuitive to me is why it is a 2D darts setup, and not a 1D or 3D darts setup? Not asking for a derivation, but just curious for any intuitive insights here.
+GypsyNulang
I have the intuition that I'll become a pathologist someday although I've been told I have a surgeon's personality. I like pathology because the hours are regular and I'd have time to teach too.
We could do a 3D board if you wanted, but I like the 2D example because we have experience with that. Maxwell in his derivation works in three dimensions, thinking of gas diffusing from a central source in all three directions. The derivation is conceptually the same, save for a few constants: it rests on statistical independence of all coordinates and dependence only on distance from the origin.
Ah thanks yea the multiple axes demonstrate the statistical independence
This was truly beautiful. Thank you so much for the great content!
Absolutely! I enjoy this fresh look of the derivation of this class of Gaussian functions. I like the way you explained
This was a great explanation! Thank you!
Excellent derivation, very intuitive. I needed it for understanding Gaussian regression.
Beautiful explanation
I just started a Patreon if you appreciate the work done on this channel: www.patreon.com/Mathoma
Thanks for viewing the channel!
So now, 26 years after I first encountered this equation in college, I finally know where it came from. A cool thought experiment coupled with some “cosmetics” which (true to their name) conceal its true identity.
Well, good job in explaining this, but it leaves a big question unanswered (as do most "derivation" videos of the Gaussian distribution). Namely, you explained how to derive f(x), the pdf of the x coordinate of the dartboard, and this is "a" bell curve. We haven't shown this is also "the" Gaussian bell curve of the central limit theorem. It's conceivable that f(x) only roughly looks like the Gaussian but is not identical. How do we show f(x) is the Gaussian of the CLT?
Excellent presentation of the derivation. Congrats
Thanks. Very clearly explained.
Thanks for this! Really, really helpful! Will you make videos about other distributions?
Thanks a lot! the explanation is fantastic!
30:51 Why does that satisfy normalization condition? Could you explain?
😮😮😮👏👏👏👌 awesome... for years this has been troubling me, how to derive the normal distribution equation... trust me, I had nightmares... 😅😅😅 I am too bad at just mugging up... thanks a ton... I used to wonder how we are able to connect two independent sample spaces through the mysterious Z... now it's quite clear. 1. The sample space has to behave like a normally distributed phenomenon 2. Anyway, the probability density curve will have area 1... and that helps me understand the t distribution even better... thank you...
Lots of assumptions, but worth it!! However, in the multidimensional case, x^2 + y^2 != r^2, so I think the Gaussian distribution might need improvement.
I don't understand how you can say at 8:46 that f(sqrt(x^2+y^2)) is equal to f(x)f(y) after analysing just a single case (if you put sqrt(x^2+y^2)=x, this is true in just a single case, and it's the case you described, y=0)
can it be explained by treating y like a parameter?
That was done because he wanted to get rid of the function φ and express the equation in terms of the function f only. He used the specific case y=0 and defined the constant λ as f(0). He said that if φ(sqrt(x^2+y^2)) = f(x)f(y) is true in general, then it must be true in the specific case y=0, which gives φ(x) = f(x)f(0) = λf(x). Since this holds for any argument x, it follows that φ(sqrt(x^2+y^2)) = λf(sqrt(x^2+y^2)) = f(x)f(y). You see at the end that λf(sqrt(x^2+y^2)) = f(x)f(y), with no φ involved.
Awesome explanation, you earned a subscriber
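For anyone who wants to see the relation λf(sqrt(x^2+y^2)) = f(x)f(y) hold numerically, here is a minimal sketch. It assumes the form f(x) = λe^(Ax^2) that the video arrives at later; the values of `lam` and `A` below are arbitrary illustrative choices, not taken from the video.

```python
import math

def f(x, lam=0.7, A=-1.3):
    """Candidate solution f(x) = lam * exp(A * x^2); lam and A are illustrative."""
    return lam * math.exp(A * x * x)

def both_sides(x, y, lam=0.7, A=-1.3):
    """Return (lam * f(sqrt(x^2 + y^2)), f(x) * f(y)) for comparison."""
    r = math.hypot(x, y)  # distance from the origin
    lhs = lam * f(r, lam, A)
    rhs = f(x, lam, A) * f(y, lam, A)
    return lhs, rhs

if __name__ == "__main__":
    for x, y in [(0.0, 0.0), (1.0, 0.0), (0.3, -0.8), (2.0, 1.5)]:
        lhs, rhs = both_sides(x, y)
        print(f"x={x}, y={y}: lhs={lhs:.12f}, rhs={rhs:.12f}")
```

Note that the point (1.0, 0.0) is the y=0 special case from the argument, while the other points show that the relation is not limited to the x axis.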
Thanks! Very nicely explained indeed.
Great video!! Helped me immensely in understanding where the normal dist. pdf came from. Thnx a lot😆😆
Hi! Pablo, from Spain.. :) Min. 2:34, Why phi times dA? Thanks in advance!
Because φ is the probability density function, so its units are (probability)/(area) (or probability per unit area). Therefore, the probability of a dart landing in a small region of area dA around some point is φ*dA, because you need to cancel the area units in the denominator.
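A small numerical sketch of the "density times area" idea, under an assumption that is not from the video: the two coordinates are taken to be independent standard normals, so the exact probability of a small square can be computed from the error function and compared with φ*dA. The function names are hypothetical.

```python
import math

def phi(x, y):
    """Assumed joint density: two independent standard normal coordinates."""
    return math.exp(-(x * x + y * y) / 2.0) / (2.0 * math.pi)

def std_normal_cdf(t):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def exact_square_probability(x0, y0, side):
    """Exact probability of landing in the square of given side centred at (x0, y0)."""
    half = side / 2.0
    px = std_normal_cdf(x0 + half) - std_normal_cdf(x0 - half)
    py = std_normal_cdf(y0 + half) - std_normal_cdf(y0 - half)
    return px * py  # independence lets the two coordinates multiply

def density_times_area(x0, y0, side):
    """Approximation phi * dA with dA = side^2."""
    return phi(x0, y0) * side * side

if __name__ == "__main__":
    for side in (0.5, 0.1, 0.01):
        exact = exact_square_probability(0.5, -0.3, side)
        approx = density_times_area(0.5, -0.3, side)
        print(f"side={side}: exact={exact:.10f}, phi*dA={approx:.10f}")
```

The two numbers should agree more and more closely as the square shrinks, which is the sense in which φ*dA is "the" probability for an infinitesimal patch.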
Great explanation! I followed everything except for the part where you introduced the integral for variance (around 28:00). Could someone clarify where the integral of x^2 * f(x) * dx comes from?
Basically, the variance of a random variable X is defined as the expected value of the squared deviation from the mean of X: E((X-mean)^2) = variance, where E is the expected value. The expected value is basically the same as the weighted mean, only the weighted mean is for discrete values of x, and the expected value is for continuous functions of x that can be integrated. The formula for the weighted mean is the sum of all the occurring values multiplied by their weights (number of occurrences), divided by the total of the weights (total occurrences of all values). In the case of expected values of continuous probability density functions, the weights are not the number of occurrences of each value, but rather the probabilities of each value (f(x)*dx), and since the probabilities add up to 1, the division by the total of the weights, or probabilities in the continuous case, leaves only the integral of x multiplied by the probability of x over all values (E(X) = Integral of x*f(x)dx). Since the mean of the normal distribution function here is 0, the squared deviation from the mean in this case is just x^2, so E(X^2) = Integral of x^2*f(x)dx. I hope it's a bit clearer, although I think it was quite confusing, so you should read this through several times and really focus on every part if you want to understand it :D.
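To make the integrals concrete, here is a rough numerical sketch. It assumes the standard zero-mean form f(x) = exp(-x^2/(2σ^2)) / (σ*sqrt(2π)) and uses a simple midpoint rule on a truncated range; the helper names and the cutoff at ±10σ are illustrative choices.

```python
import math

def normal_pdf(x, sigma):
    """Zero-mean normal density with standard deviation sigma (assumed form)."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g from a to b."""
    width = (b - a) / n
    return sum(g(a + (i + 0.5) * width) for i in range(n)) * width

def moments(sigma):
    """Return (total probability, mean, variance) computed by numerical integration."""
    lo, hi = -10.0 * sigma, 10.0 * sigma  # tails beyond this are negligible
    total = integrate(lambda x: normal_pdf(x, sigma), lo, hi)
    mean = integrate(lambda x: x * normal_pdf(x, sigma), lo, hi)
    variance = integrate(lambda x: x * x * normal_pdf(x, sigma), lo, hi)
    return total, mean, variance

if __name__ == "__main__":
    total, mean, variance = moments(1.5)
    print(f"total={total:.8f}, mean={mean:.8f}, variance={variance:.8f}")
```

With sigma = 1.5 the variance integral should come out close to sigma^2 = 2.25, the total close to 1, and the mean close to 0.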
no words to thank you enough
4:20 (nice) wouldn't that also rotate the box by θ' - θ?
Why didn't we say that lambda =f(0) at 18:15
Thanks for this - very well explained. However, one question at the back of my mind is that if you consider a non-zero mean, wouldn't you need to adapt the definition at 27:18 to be $$ \int (x - \mu)^2 f(x) dx $$ ? And wouldn't this complicate things a lot?
Don't get me wrong, I understand the intuition behind just shifting the mean, it just seems like a potential snag that I would get caught on in an exam.
Yes, even I have this doubt.
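One way to see that the non-zero mean does not complicate things much is a change of variables. As a sketch, assume the shifted density is g(x) = f(x - μ), where f is the zero-mean density from the video with variance σ^2 (the name g is introduced here only for illustration). Substituting u = x - μ, so du = dx:

```latex
\begin{align*}
\int_{-\infty}^{\infty} (x-\mu)^2\, g(x)\, dx
  &= \int_{-\infty}^{\infty} (x-\mu)^2\, f(x-\mu)\, dx \\
  &= \int_{-\infty}^{\infty} u^2 f(u)\, du \\
  &= \sigma^2
\end{align*}
```

So the integral with (x - μ)^2 for the shifted curve reduces to exactly the zero-mean integral already evaluated, and the variance is unchanged by the shift.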