This needs to be mandatory viewing for EVERYONE.
I second that!
"Correlation does not equal causation" was my old stats teacher's favourite phrase along with "always interpolate, never extrapolate." :)
Extrapolation is actually necessary in certain circumstances though - for example predicting growth of global human population, economic forecasts, environmental forecasts regarding climate change.... anything that has to do with the future.
Post hoc ergo propter hoc!
Bah. I know my rock keeps away tigers because I have never seen a tiger for as long as I have had it.
SilortheBlade Makes sense to me
Nicolas Cage movies are correlated through yet another unmentioned variable: summer. Nicolas Cage is an action movie star. Action movies are generally targeted for summer releases. Summer is also hot, which is the cause behind air conditioner sales and swimming, the latter of which is of course the cause of drowning.
Pfhorrest Or it could be that people who have endured a Nicolas Cage movie are more likely to drown themselves ...
That's true, but the data shows a close correlation over multiple years, not just over the seasons of a given year. It just so happens that the summers of years with more Nicolas Cage movies also have more drownings.
This has been my favorite CrashCourse season by far. Really enjoying the material and the host!
*Me:* I used to think correlation implied causation.
*Me:* Then I watched this video. Now I don't.
*Friend:* Sounds like the video helped.
*Me:* Well, Maybe.
lol. Well, probably.
The video explains that two things being correlated does not mean that one causes the other. One *can* be the cause, but it's not logical to infer that from the correlation alone. It was not the floor itself that broke the glass, even though it is involved in the breaking; it was its impact with the glass, *caused* by gravity.
XKCD is a pretty good comic :)
Kachimbo somebody missed the joke
Herodotus Von 8428 no, someone got the joke, but felt the need to expand our knowledge.
A class on nonlinear relationships would be FANTASTIC :) And more classes in general (e.g., on general versus mixed effects models, GAMs, etc.). Thank you for your dynamism!
Better explanation than my university level stats class. 👍
Everyone needs to see this! Just because things seem connected on the surface doesn’t mean they’re related and Visa Versa!
psst. its vice versa, not visa versa
and if they're not connected then they DON'T CORRELATE. this shit's a red herring.
@@improover113 talking specifically about causal relationships, as the phrase states explicitly
Puppy cat! I didn't know that they'd made a stuffed animal of him. This has greatly improved my day.
Crash Course, thank you so much. This awesome course is definitively above the curve!
When she apologises for using imperial units......
I haven't watched Nicolas Cage movies, AND I haven't drowned. Aha!
This is the best tutorial I have watched on this topic.
without you guys i would not pass my exams thank you so much
"Air Cons, and Con Airs"
Amazing
Watching Stat for fun again.
I feel some people go so far in this argument that they seem to argue that correlation disproves causation.
E.g. "That's only correlation, it doesn't prove causation, obviously you are wrong."
Yes, correlation doesn't prove causation, but it most definitely does not disprove causation. Further, it might suggest causation, or that a third factor is causing both phenomena to occur. It's frustrating to give data in an argument, only to have the other side counter with, "That's only correlation, it doesn't prove causation, you are wrong."
EasySnake 100% agree
i've seen this too! It irks me to no end.
This is crash course statistics and statistics is all about probability?
Every time I see one of these videos, I look at the view count and know that there are that many more people out there who are better educated about this topic, and that makes me very optimistic for the future. Keep up the great work, guys!
“Mr. Fluffy misses you.”
*pouts thinking of the cat I don’t have missing me*
"..if people blink more when they're lying!"
Our Professor: 😳
This needs to be essential viewing for EVERYONE.
Thank you so much for sharing. You're so much better at explaining than my professor.
This was the funniest Crash Course video I've ever seen. Her comedic timing is excellent. Though I still don't know if that clever mayor was a man or a woman.
I've seen people both conflate correlation with causation in situations that are clearly coincidence and insist that correlation does not equal causation when the pattern of cause and effect is obvious.
I wish all my scatterplots ended up making pictures of dinosaurs.
So this was great. You are definitely one of my favorite Crash Course hosts. And I took statistics back in 1994. I have one question that boggles me: who decides, and when, that there really is causation?
Example .... cigarette smoking and lung health. The negative effects are clearly visible, the correlation is there ... but is it really the cause? When and how do we get to a positive causality?
Or is it left to the interpreter? Or is it all just relative? Or, at the end of the day, is it meaningless, so that everyone can say "correlation doesn't equal causality" and your data and beautiful charts and correlations just fizzle out?
That's the tricky part! Ultimately they all need to be interpreted. Overall, there is no true "proof", just higher levels of confidence. I am confident that the city of Paris exists, even though I've never been there. The process generally starts by asking "Is this even possible?" and "Does this make some sense?" Then you can go back and try to find some other cause of the data you got. Eventually, you have to do careful experiments. But even well-planned experiments can have hiccups and biases (there have been many cases of seemingly high-confidence experiments not being repeatable by other professionals). Often, multiple experimenters need to come up with the same results on their own (and usually with their own equipment) before the scientific community is convinced. Overall, it's a difficult and time-consuming process.
In health data like the lung example, there is a set of criteria called the Bradford-Hill criteria. Google it. These are criteria for judging whether something can be considered causation. It is not a checklist: you still need to do your own scientific interpretation. But it’s a good way to get an idea of whether the data you're looking at implies causation or not. The criteria are: effect size, consistency, specificity, temporality, biological plausibility, dose-response relationship, coherence, experimental evidence, analogous results. Interestingly, Bradford Hill, who came up with this list, is the same Hill who co-authored the original Doll and Hill paper that established the link between smoking and lung cancer!
At 0:27, it must have taken everything you had to not blink.
Thank u Crash Course
Islam xDDDD
How wonderfully you explain things! There's no need to translate anything at all! (Originally written in Russian, deliberately.)
Love this upload 😍
...how do you fit a regression line through a circle (or fat ellipse) on a 2D-scattered, plot...
...how do you define accuracy where there are fewer data points, even though the fitted-curve looks similar, (do you overlay random information certitude measure sigma bars)...
*_...(in case you missed the first question: flip the plot axes for a different regression line...)_*
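A small sketch of the axis-flip point, in Python with made-up numbers (the data and the helper names `ols_slope` and `pearson_r` are assumptions for illustration, not anything from the video). Regressing y on x and regressing x on y generally give two different lines, and the product of the two slopes equals r^2, so the lines only coincide when the points are perfectly collinear. For a round blob with r near 0, the two lines are nearly perpendicular.

```python
from math import sqrt

def ols_slope(xs, ys):
    """Least-squares slope for predicting ys from xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, chosen only to keep the arithmetic simple.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

slope_y_on_x = ols_slope(x, y)         # dy/dx when predicting y from x
slope_x_on_y = ols_slope(y, x)         # dx/dy when predicting x from y
flipped_as_dy_dx = 1 / slope_x_on_y    # the flipped line drawn on the original axes
r = pearson_r(x, y)

print(slope_y_on_x, flipped_as_dy_dx, r ** 2)
```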
wow! Thank you
Excellent video! Thank you!!!
I love this series! However, you made one small lie: R^2 does not have to be between zero and one, but can in fact be negative.
You spoke of mx + b, but didn't mention how the value of b is determined (and if it is chosen horribly wrong, you can get a negative R^2, because the model is then worse than just predicting the mean).
Keep up the series! :)
Squares of real numbers are always nonnegative, by definition. They can never be less than zero -- the square of -5 is 25, for example.
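Both comments above can be right, depending on which quantity is meant. The square of Pearson's r can never be negative, but the coefficient of determination, defined as 1 - SS_res / SS_tot, can go below zero when it is computed for a line that was not fitted by least squares. A minimal Python sketch with made-up data (the `r_squared` helper and the deliberately bad intercept are assumptions for illustration):

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1 - ss_res / ss_tot

# Hypothetical data lying exactly on y = 2x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

good_pred = [2 * xi for xi in x]       # correct slope and intercept
bad_pred = [2 * xi + 10 for xi in x]   # correct slope, badly chosen intercept b

r2_good = r_squared(y, good_pred)
r2_bad = r_squared(y, bad_pred)        # negative: worse than predicting the mean

print(r2_good, r2_bad)
```

For the ordinary least-squares line with an intercept, the two definitions agree, which is why the value stays between 0 and 1 in the setting the video describes.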
I was TRICKED into watching this by the title. How hard would it be to add, "WARNING! THIS IS STATISTICS, DWEEB" to what appears on my temptation screen?
It was really good.
Anecdotally, after playing Simpsons: Hit & Run (a GTA clone), I genuinely drove more recklessly for a little while. Not like I got into an accident, but like I was cutting corners tighter, and being a little heavier on the pedal. I had to work at it to knock it off. Really really good game though.
I was JUST reading up on this in class! 😂
Thank you!!!
Learned so much from this video.
Love this video and the channel, also - @1:43 You've spelled eruptions wrong...
y = mx + b , is this some American standard? In Sweden it's y=kx+m
It doesn't really matter either way. The general convention is that the last letters of the Latin alphabet, i.e. x, y and z, are used as placeholders for unknown quantities, whereas letters from the beginning (e.g. a, b and c) or middle (e.g. k, l, m and n) are used as placeholders for known quantities (to be supplied or deduced when doing a specific example). The placeholders for known quantities may differ between countries for many reasons (ease of pronunciation, legibility, tradition, etc.). Tradition also means that the same equation often uses different placeholders in math and physics. Example: in math class they may use y = ax + b, in physics class they may use y = mx + c, just because ... (and then of course in the kinematic equations this becomes e.g. v = at + v0, representing physical quantities).
Correlation does not necessarily mean that there is causation between two variables.
However, don't walk away thinking correlation disproves causation. This isn't politics. There are more than two possibilities. (There are in politics too, but ignore that.) Thanks, and have a good day.
As a final note: Time taken to get from point a to point b is negatively correlated with speed. There is (by definition no less) causation there.
Sango, that's a good tip. But I fear that addressing people as the "scientifically illiterate" might not be the best way to get your message across. (What I would give for Crash Course: Rhetoric).
Everyone was illiterate (scientific and otherwise) at one point. It is one's duty to make sure they do not continue to be.
There is no causation only chaos.
It is absolutely true that everyone begins illiterate, and there should be no shame in that. However, referring to people as such can cause them to misinterpret your message as being condescending, even though you had no intention to be that way. Regardless, they are now slighted, and in retaliation, they ignore your advice, no matter how reasonable it was.
kaizersabre, there is no Dana, only ZOOL.
That's not the graph Jim Carrey and Jenny McCarthy showed me.
Comment containing the word EVERYONE in caps lock.
Child Fs but why
Ok, you talked me into it.
containing the word EVERYONE in caps lock
Me: focus, you have a test this week
Also me: OMG PUPPYCAT!!
Love the series!!!
You mentioned that a steep line can have a strong correlation, but there were no graphics to support it. Emphasis for viewers: the slope and the correlation are different things.
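To make that distinction concrete, here is a rough Python sketch with invented numbers (the data and the helper names `pearson_r` and `ols_slope` are assumptions, not taken from the video). A shallow line and a steep line can both have r = 1, while a steep but noisy relationship has a large slope and a weaker correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ols_slope(xs, ys):
    """Least-squares slope for predicting ys from xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

x = [1, 2, 3, 4, 5, 6]
shallow = [0.1 * xi for xi in x]                  # slope 0.1, no scatter
steep = [10 * xi for xi in x]                     # slope 10, no scatter
noise = [20, -20, 20, -20, 20, -20]               # hypothetical scatter
steep_noisy = [10 * xi + n for xi, n in zip(x, noise)]

for name, ys in [("shallow", shallow), ("steep", steep), ("steep_noisy", steep_noisy)]:
    print(name, "slope:", ols_slope(x, ys), "r:", pearson_r(x, ys))
```

The slope says how much y changes per unit of x; r says how tightly the points hug the line.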
Good episode, but some things would need exercise and ‘usage’ in order to be memorized well and longer-term, like r and r squared.
Squared correlation r^2
Line of regression
Can anyone explain standard deviation in a little more depth? I'm still not sure what information it tells us in a scatter plot.
I am also looking for that :/
this was an awesome video
The example of changing the units on the y-axis is only relevant if you're not doing your dimensional analysis properly. If the slope of the feet-feet plot is 0.5 ft/ft, then the slope of the meter-feet plot is about 0.15 m/ft, which is the same 0.5 expressed in different units.
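A quick way to check the units argument numerically, sketched in Python with invented measurements (the data and the `fit_and_r` helper are assumptions for illustration): converting y from feet to meters multiplies the slope by 0.3048 and leaves the correlation coefficient untouched.

```python
from math import sqrt

FEET_TO_METERS = 0.3048  # exact definition of the international foot

def fit_and_r(xs, ys):
    """Return (least-squares slope, Pearson r) for ys against xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sxx, sxy / sqrt(sxx * syy)

# Hypothetical measurements, both in feet.
x_ft = [1, 2, 3, 4, 5, 6]
y_ft = [0.4, 1.1, 1.4, 2.1, 2.4, 3.1]

# Same y values expressed in meters.
y_m = [v * FEET_TO_METERS for v in y_ft]

slope_ft, r_ft = fit_and_r(x_ft, y_ft)
slope_m, r_m = fit_and_r(x_ft, y_m)

print("slope (ft/ft):", slope_ft, "slope (m/ft):", slope_m)
print("r in feet:", r_ft, "r in meters:", r_m)
```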
Just noticed puppycat on her table! 💗
Thank you for thissss!!
Gain in my knowledge is perfectly correlated with the number of crash course videos I watch and shows the value of absolute +1 as the correlation coefficient #CrashCourse ..... 😁😁😁
who is here for school
Loved it!
Please do more literature!!
Great video!!!😊
While taking my stats course I started sleep talking and explained the empirical rule to my mom
Great video, very helpful for learning statistics
Thank You.
3:12
> Hummer, the epitome of in-your-face Americanness
> Russian license plate
If A caused B then there is a correlation between A and B.
The rising of the Sun caused the eating of an ice cream by John.
Therefore, there is a correlation between the rising of the Sun and the eating of an ice cream by John.
My question is, how would you quantify those events and plot the correlation between them on a graph? Would I count the number of times these events occurred? What if an event only causes another once? What if John died after the first ice cream? Can we still say that there was a correlation?
now go teach the media this so they can stop blaming video games for all the world's problems
does the "r²=0.7" mean that we could predict accurately by 70% ?
yes
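A hedged footnote to the "yes" above: r^2 = 0.7 is usually read as "the fitted line accounts for about 70% of the variance in y", which is not quite the same thing as predictions being 70% accurate. A minimal Python sketch with made-up data (the `fit_line` helper and the numbers are assumptions for illustration) that computes r^2 as the share of variance explained:

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting ys from xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

slope, intercept = fit_line(x, y)
pred = [slope * xi + intercept for xi in x]

mean_y = sum(y) / len(y)
ss_tot = sum((yi - mean_y) ** 2 for yi in y)            # total variation in y
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) # variation left after the fit

r_squared = 1 - ss_res / ss_tot   # share of variation explained by the line
unexplained = ss_res / ss_tot     # share the line does not account for

print(r_squared, unexplained)
```

Individual predictions can still miss by a lot even when r^2 is high, so it is safer to look at the size of the residuals as well.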
Wait... Technically everything is connected. Maybe 2 variables are correlated even though it doesn't make sense that they cause each other, but that happens because these 2 variables are connected to other variables that we didn't observe, and those variables can indirectly influence the relationship between the 2 variables we are comparing. So I guess that means, one way or another, correlation DOES imply causation. Error 404
Hello great video
@crash course team, the graphs in the Datasaurus Dozen shown at the end don't all seem to have the same correlation coefficient. A few look like they have r=1, a few r=0. Please correct me if I'm wrong.
We don't predict the temperature in Fahrenheit, we calculate it using the formula (c*9/5)+32
EXCELLENT!
I watched this video without having seen the previous ones, and spent a considerable amount of time wondering "what the heck is an 'old faithful eruption' ?"
(For those who have the same problem: "Old Faithful" seems to be the name of a geyser. I don't know where it is, but when an English-language YouTube show refers to a location, person, event or sports ritual you have never heard of, you can be pretty sure it's in North America.)
I don't know, Nic Cage may be dragging people to the deep after they see his movies. The evidence is there.
Those movie computer tick noises (when charts are presented) drive me mad, and I don't even have EQ in my setup to damp them down. Good vid though!
Y = mx + b?i thought it was c
Phony Aardvark i learned it as y= ax+b
Bryan, what does m stand for? The mmmslope? (I actually don't know the answer, now that I think about it)
O_O *head-explosion*
I know it was c (at least in my part of the world)
"mx + c" is also reasonable in the sense that "c" is often used to refer to some "constant". That is also the explanation for the c in E=mc^2: the speed of light in a vacuum is a constant.
*_...there'd be a negative-correlation where reducing air conditioning increases swimming..._*
*_...or, an overriding 'cause' leading to watching-speeding or doing-it, another, negrelation..._*
*_...so...what's the mathematically-concisely-stated-statistical-rule for causality-guessing..._*
*_...(making statistics, like modulo arithmetic: where compounded moduli may get better)..._*
Any chance of crash course architecture (history of?)
When it's hot, people with no A.C. tend to go to the movies. Movie theaters are usually quite air conditioned and you get to enjoy it for a couple of hours.
2:56 to the height of the Holy Spirit- 😮💨
I’ll have you know that my cat, Mr. Whiskers, loves me.
Yes. I am so sick of hearing people not know that correlation does not equal causation
The first eruption scatter plot has a typo
i love this
Are regression lines ever parabolic?
What would be some examples if so?
One example: the optimum launch angle for maximum range. Range as a function of angle has a turning point around 45 degrees, where it reaches its maximum and then goes back down.
You can use a parabola to fit data. That would be polynomial regression, where your x's are taken to various powers. Sometimes it's really useful to do so, since data often isn't perfectly linear.
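The two replies above combine into one small experiment, sketched here in plain Python under stated assumptions: ideal projectile range R = v^2 * sin(2θ) / g, a made-up launch speed of 20 m/s, g = 9.81 m/s^2, and a hand-rolled `fit_quadratic` helper (hypothetical, not a library function) that solves the least-squares normal equations for y = a*x^2 + b*x + c. The real curve is a sine, not a parabola, so the quadratic is only an approximation, but its turning point lands at about 45 degrees.

```python
from math import sin, radians

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    # Power sums S[k] = sum(x^k) and right-hand sides T[k] = sum(y * x^k).
    S = [sum(x ** k for x in xs) for k in range(5)]
    T = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x4 matrix for the unknowns (c, b, a).
    M = [[S[i + j] for j in range(3)] + [T[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    c, b, a = (M[i][3] / M[i][i] for i in range(3))
    return a, b, c

SPEED = 20.0   # assumed launch speed in m/s
G = 9.81       # assumed gravitational acceleration in m/s^2

angles = [15, 25, 35, 45, 55, 65, 75]   # launch angles in degrees
ranges = [SPEED ** 2 * sin(radians(2 * t)) / G for t in angles]

a, b, c = fit_quadratic(angles, ranges)
best_angle = -b / (2 * a)   # vertex of the fitted parabola

print("fitted a, b, c:", a, b, c)
print("angle at the turning point:", best_angle)
```

A negative a means the parabola opens downward, so the vertex is a maximum, which matches the "reaches its max range then goes back down" description.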
This was very interesting...though, I wonder, just how significant it is ? Can you give me a chi squared on that ?
wish you'd touch on poker. Math and Data is very important in poker
It was hilarious, the data presented by the reporter.
I love this channel
1:20 They spelt eruptions wrong on the y-axis...
In Pearson's study, did he take into account that people often shrink as they get older?
The Bee and Puppy-cat doll in the back is sooo cute (๑>◡
The narrator is very easy to listen to. Even I understood the content
Cool-Cage Act; hilarious.
Can we do a talk on how you DO identify causation, not just rule out plausible causal relations? Or are we taking a Humean view of causation and saying there is no real force of causation at all, just a fixed regularity that humans imagine happens?
I think it requires an experimental study
I would like to confirm: is the equation of a line y = mx + b or y = mx + c?
Does anyone know how to interpret a Bland-Altman analysis?
So no one's commenting how she's got a *puppycat plush toy* behind her?
Hmm. I may have needed this video 2 years ago when I was toiling in the halls of grad school
omg! PUPPYCAT 😭💗
Mr. Fluffy does not miss me.
Mr. Fluffy ran away...
Wow this is my doctor and his funny science