We got two examples of how to use ancova here, but no real explanation of what ancova actually *does*. :(
An ancova is a regression where you try to fit multiple lines at once. The covariate (e.g. the baby's age) is your X variable, and the Y variable is your response as usual (baby weight gain). In the simplest case, each group gets its own line [y=bx+a] with the same *slope* (b), but different *intercepts* (a1, a2, etc.); in this case, one intercept for each baby formula type. Lines that have the same slope but differ in intercept will appear parallel, but at different "heights" on the plot: this would mean, in this case, that age has the same overall effect on weight gain, but one formula gives a higher "baseline" weight gain at any given time point than the other.
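A pure-Python sketch of that "parallel lines" fit (the data are invented for illustration, not from the video): the shared slope is the pooled within-group slope, and each formula then gets its own intercept.

```python
# Invented (age_in_days, weight_gain, formula) data for illustration only.
data = [
    (10, 1.2, "A"), (20, 2.1, "A"), (30, 3.2, "A"), (40, 4.0, "A"),
    (10, 1.7, "B"), (20, 2.6, "B"), (30, 3.6, "B"), (40, 4.6, "B"),
]

groups = {}
for x, y, g in data:
    groups.setdefault(g, []).append((x, y))

# Shared slope b = pooled within-group Sxy / pooled within-group Sxx,
# where each group is centred on its own means.
num = den = 0.0
means = {}
for g, pts in groups.items():
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    means[g] = (xbar, ybar)
    num += sum((x - xbar) * (y - ybar) for x, y in pts)
    den += sum((x - xbar) ** 2 for x, _ in pts)
b = num / den  # one slope for all groups

# Each group's own intercept: a_g = ybar_g - b * xbar_g.
intercepts = {g: ybar - b * xbar for g, (xbar, ybar) in means.items()}
```

With these made-up numbers the two lines come out parallel (same `b`) with formula B sitting higher at every age, which is exactly the intercept difference described above.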
So the p value for your covariate answers the question "is there a significant effect, i.e. a nonzero slope, of X (age) on Y (weight gain)?", while the p value for your grouping variable answers the question "is there a significant difference in intercept between the regression line for one type of formula and the regression line for the other?" In this case, there was a significant difference - and organizing the data along an X variable (age) enabled us to detect it, whereas when age was not considered, the data that would have fallen neatly around each regression line was smushed together around a single mean per group, obscuring the effect of formula with a whole lot of noise. This is why the p value changed when age was added.
If an interaction term is included, it tests for a difference in *slope* between the groups. So an age-by-formula interaction might mean, for example, that weight gain is similar shortly after birth, but it makes a difference which formula you use later on in the baby's life. This would appear in the regression plot as two lines (representing the formula types) that are close together near X=0, but move apart farther to the right on the X-axis (as age increases) because one line is steeper than the other.
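A sketch of what that interaction looks like numerically (again with invented data): allowing an interaction is equivalent to letting each formula have its own slope, so fitting each group's line separately shows the divergence.

```python
# Invented data: both lines start near y = 1 at x = 0, then diverge with age.
def fit_line(pts):
    """Ordinary least squares for one group: returns (slope, intercept)."""
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((x - xbar) * (y - ybar) for x, y in pts) / \
        sum((x - xbar) ** 2 for x in xs)
    return slope, ybar - slope * xbar

formula_a = [(0, 1.0), (10, 2.0), (20, 3.0)]
formula_b = [(0, 1.1), (10, 3.1), (20, 5.1)]

slope_a, int_a = fit_line(formula_a)
slope_b, int_b = fit_line(formula_b)
# Similar intercepts (close near x = 0) but different slopes:
# that slope difference is what the interaction term tests.
```

Here the intercepts are nearly equal (1.0 vs 1.1) while the slopes differ (0.1 vs 0.2), so the gap between the formulas grows as age increases.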
Life Happens Feels bad lol!
Thank you for the explanation!
Do you also understand repeated measures ANOVA and the example of the effects of music on running? Does she mean that we should normalize/standardize the values of running time with respect to each individual?
If only I were addicted to stats in the same way
So much Tetris content from the Green brothers this week! Love it!
2:34 i see this every time i recall sums of square variation from mean. so helpful. 👻🌩💜
Are you going to cover multivariate analysis? I’d love to watch it 😍, as much as I love the way you explain every topic :)
this is a gr8 ep. talk about application of moving parts finally. perfect. it is not just memorize these buttons in R
Great content and timing! Have an RPM assignment due tomorrow, and it's always good to have a refresher
Didn't expect to see ANCOVA in a stats intro series! Now I am expecting the multivariate analysis...
I'm just saying, y'all should make a music theory series. :)
My test is tomorrow morning!
Good luck bro! :)
Boom! Tetris for Adriene!
Is it a coincidence or was it planned for this to come out the same week as the Classic Tetris World Championship? :D
B O O M ! T E T R I S 4 J E F F
Will you cover the various means comparisons methods?.. Bonferroni, Tukey, Sidak, Scheffe, Fisher, Holm-Bonferroni, and Holm-Sidak...
8:40 why isn't that done through a simple linear regression?
How did the p-value change before and after you added "age in days" variable in 6:00 ?
The F-test in ANOVA is, in its simplest form, the ratio of explained variance to unexplained variance. If we omit age, the denominator (unexplained variance) is large; it becomes smaller when age is added to the model, because age soaks up variance that was previously unexplained. Hence the value of the F statistic is higher and more likely to exceed the critical value, and the p-value - the probability of a result at least this extreme if the null were true - is smaller in the second model.
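A toy numeric version of that argument (all sums of squares invented, and degrees of freedom omitted to keep the ratio bare): the variance explained by formula stays the same, but adding age shrinks the error term, so the F ratio grows.

```python
# All numbers are made up for illustration; dfs are left out of the ratio.
ss_formula = 30.0              # variance explained by the grouping variable
ss_error_without_age = 120.0   # unexplained variance, age omitted
ss_age = 90.0                  # variance the covariate soaks up
ss_error_with_age = ss_error_without_age - ss_age

# F is (roughly) explained / unexplained:
f_without = ss_formula / ss_error_without_age
f_with = ss_formula / ss_error_with_age
# The numerator is unchanged, but the shrunken denominator makes
# f_with four times larger here, hence a smaller p value.
```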
Does anyone understand repeated measures ANOVA and the example of the effects of music on running? Does she mean that we should normalize/standardize the values of running time with respect to each individual?
Pretty much yes, using a method known as random effects. It's the same logic that you would apply, for example, if you've got a sample of twenty lab mice from five different mothers, and you want to account for the fact that mice sharing a parent might be more similar on average, meaning they don't quite constitute twenty independent measurements.
In this case, instead of grouping mice by mother, you group running time measurements by person. We say that you "include person in the model as a random effect".
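The core intuition can be sketched as centring each runner's times on their own mean (a real repeated-measures ANOVA or random-effects model does more than this, and all the run times below are invented):

```python
# Invented run times (seconds) per person, per condition.
runs = {
    "ann":  {"music": 25.0, "silence": 27.0},
    "ben":  {"music": 31.0, "silence": 33.5},
    "cara": {"music": 22.0, "silence": 23.0},
}

# Subtract each person's own mean, so slow and fast runners
# become comparable and only the within-person effect remains.
centred = {}
for person, times in runs.items():
    mean = sum(times.values()) / len(times)
    centred[person] = {cond: t - mean for cond, t in times.items()}
```

After centring, every music run is below that person's average and every silent run above it, even though Ben's slowest time is faster than no one's - the between-person noise is gone.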
1) Why bother with an ANCOVA when you can run a regression with both variables? Redhead / not redhead is binary, so it will work just fine in a regression model. If you wanted to expand it to include redheads compared to other hair colors, we could just dummy code each category and drop one from the model as the reference variable.
2) What was up with the ~ in the error for the RMA after accounting for the base run time? Does ~ represent an approximate value, even though your decimal was wicked long? At first glance I thought it was a - (negative) instead of ~, but that wouldn't make sense. So...?
A regression model with two explanatory variables, one continuous and one categorical or binary, is exactly what an ancova is. It works the same way whether the categorical variable has two or more possible values, and no manual dummy coding is required - the software builds the dummy variables for you under the hood.
No idea what's up with that tilde. I think it's either a mistake, or it's supposed to convey that it's a mixed model (so the error term is calculated in a somewhat different way).
Ok then, if it is the same, what's the advantage of the ANCOVA option if we consider the outputs provided by various software? I can run a regression and get the F results table, but I get the added benefit of the regression coefficients table and all the fun stuff that comes with it, such as residual diagnostics. Sounds like running an ANCOVA would give a light touch analysis rather than taking a deeper look.
Mathematically, the same thing is happening under the hood. If you've got a piece of statistical software that gives you different output when picking the "ancova" option and the "regression" option, that's because it's being selective about what part of the results to show you. But even if you just picked "regression", entered one categorical and one continuous explanatory variable... you just ran an ancova.
An ancova isn't *definitionally* more reductive than a regression analysis. It's just another name for this particular case of a linear model, much like the t-test is really just a linear model with a single binary x-variable. Case in point: if you're using R, the same function ("lm") can be used to run any of them, short of a GLM or a mixed-effects model.
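That t-test equivalence is easy to check by hand (invented data): regress y on a 0/1 group indicator, and the slope comes out exactly equal to the difference in group means, which is the quantity a two-sample t-test examines.

```python
# Invented measurements for two groups, coded x = 0 and x = 1.
ys0 = [4.1, 3.9, 4.0]   # group coded 0
ys1 = [5.0, 5.2, 5.1]   # group coded 1
xs = [0] * len(ys0) + [1] * len(ys1)
ys = ys0 + ys1

# Ordinary least squares slope on the binary predictor.
xbar, ybar = sum(xs) / len(xs), sum(ys) / len(ys)
slope = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
    sum((x - xbar) ** 2 for x in xs)

mean_diff = sum(ys1) / len(ys1) - sum(ys0) / len(ys0)
# slope == mean_diff: the "regression" and the "t-test" are
# looking at the very same number.
```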
Life Happens Thanks for the insights!
@@RUJedi Happy to help. :D
What decades long argument? Nobody has ever argued that the best Tetris piece is anything other than the straight
*laughs in T-shape*
The 5 people who disliked this content
*FLAT EARTHERS* 😒
Glad to hear Tetris has educational value.
+Flaming Basketball Club He doesn't watch any of the videos before making a general statement. It's hilarious. 😂
I thought that you were talking about female models
I'm a bit late to this crash course, but as someone who works with Excel (statistics or otherwise), the tables in this video look weird to me. I recommend right-aligning all numbers, adding thousands separators, and aligning the decimal points of numbers within the same column. It will greatly improve the readability.
There's a terrible (60hz electrical?) background hum in this video :(
Shouldn't it be Anocova?
Y
Well done!
When you speak this fast I feel like I'm watching an episode of "Gilmore Girls" and I'm not laughing.
A Spoiler Alert for which Tetromino is best would have been nice. Now the whole game is ruined for me!
First
3rd comment