As always, great episode! I really enjoy these more statistical videos in combination with data visualization. It would be nice to see those kinds of videos more often in the future!
Thanks! I’ll do my best 🤓
Hey Pat!
I’m wrapping up a master’s degree at UConn and wanted to thank you sincerely for your videos!! They have definitely saved me a lot of time and made me more proficient at data analysis/visualization. I still have a long way to go, but the growth from here will be exponential thanks to you.
I hope you are well! I look forward to catching up with as much of your content as I can 🤓 you are awesome!!!
You’re too kind Levon - thanks for watching! 🤓
Awesome video. I’m in California, where there isn’t snow and the longest range of data has varied very little in the last 100 years. That said, I’ve used what I’ve learned through your channel to make some awesome visuals of my survey data (Likert scales). Thank you!
Awesome! You just gave me an idea for maybe making a visual to indicate droughtiness
Great video, and thanks for the free resources you provide. I'm having great fun learning this in my first year as an undergraduate.
Wonderful! I love it when I see undergrads benefiting from my stuff. 🤓
I have used nls and gsl_nls, and I wonder how geom_smooth compares to these two other functions (I guess they're basically the same thing). Thank you very much. You are a great help. Learning R with you is fun.
Pat,
Very interesting. I played around with the data some more following your lead. In general I see similar trends, and my curves are very tight together. The 10:1 manual curve is in fact the steepest; the low-temperature curve (that is, for temps in the range 0 to -5 C) is 8. The simple model is slightly less and the advanced model the lowest. However, they are quite tight. The 12:1 value that I quoted I saw somewhere in the climate literature, but there are clearly local variations, especially close to these big lakes. The cursory work I did on this was not sufficient and not as elegant as your approach.
I tried to sort my tmax data into bins (0 to -20 C was my range) and then detect temperature groupings on the prcp-snow plot. I also tried to see if ggdensity could bring out these groups better than I could by eyeballing the data. Again, little luck. I did notice that the data I studied are dominated by values in the 0 to -5 C range, so I decided to pull those out and treat them separately (the low-temperature curve). Perhaps the most revealing graph was the facet_wrap, where I divided the temperature data into four groups (ranging from 0 to -20 C). The range of precipitation amounts (as one might expect) decreased as one went from the 0 to -5 C group to the -15 to -20 C group. There was a considerable shortening of the smooth line, again showing that the "best" data is the low-temperature data (where you naturally have more instances of snow and precip), in addition to much higher amounts of both in mm. In none of my attempts did I see a significant effect of temperature on the fitted curves. I simply did not see it (even if it does exist). I have not touched on the precision of these measurements or tried to factor it in. Again, thanks for these great lectures with real data.
Thanks again for the video idea!
@Riffomonas You are welcome. Glad to be able to help.
I've been struggling for days to turn the geom_smooth line black. Thanks.
#BringPatBack
🤓
Forgive me if you've addressed this in a prior video, but is there a reason you still use the magrittr pipe instead of the base R pipe?
Not sure I have a "good" reason. I've used it for a long time and have never used the base R pipe ... I think it's far more commonly used, which avoids cognitive load for learners ... I think it's better than the base R pipe because you can indicate the piped data going to an argument with a dot (.). This article seems pretty solid on the issue: www.infoworld.com/article/3621369/use-the-new-r-pipe-built-into-r-41.html
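For anyone comparing the two pipes, here is a minimal sketch of the placeholder behavior mentioned above. It is an illustration rather than anything from the video: it uses the built-in mtcars data set as a stand-in, and it assumes R 4.2.0 or later, where the base pipe has its own placeholder (an underscore) that has to be passed to a named argument.

```r
library(magrittr)  # provides the %>% pipe

# magrittr pipe: the dot marks where the piped data frame should go
fit_magrittr <- mtcars %>% lm(mpg ~ wt, data = .)

# base R pipe (assumes R >= 4.2.0): the underscore placeholder
# must be supplied to a named argument, here data =
fit_base <- mtcars |> lm(mpg ~ wt, data = _)

# compare the fitted coefficients from the two calls
all.equal(coef(fit_magrittr), coef(fit_base))
```

Without a placeholder, both pipes send the left-hand side to the first argument of the function on the right.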
If I want to fit a linear regression model to microbiome data, relating the relative abundances of a number of OTUs to BMI, what would the formula be for testing their significance? Should I test the relative abundance of a single OTU against BMI?
We can plot them all in a single graph using facet_wrap, but I don't know how correct that would be.
It would be something like richness~bmi. You'd also want to make sure that richness and bmi are both normally distributed. I'd be pretty reluctant to use relative abundances since you will likely have a ton of zero values that will mess with the calculations.
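A rough sketch of what richness~bmi could look like in practice, using only base R. The data frame, its column names, and the simulated values are all hypothetical and exist only to make the example runnable; they are not from any real study.

```r
set.seed(1)

# hypothetical data: one row per subject, values simulated for illustration
subjects <- data.frame(bmi = rnorm(50, mean = 27, sd = 4))
subjects$richness <- 120 - 1.5 * subjects$bmi + rnorm(50, sd = 10)

# rough normality checks on each variable (Shapiro-Wilk test)
shapiro.test(subjects$richness)
shapiro.test(subjects$bmi)

# linear model of richness as a function of BMI
fit <- lm(richness ~ bmi, data = subjects)

# the "bmi" row holds the slope estimate, its standard error, and its P-value
summary(fit)$coefficients
```

With real data, the subjects data frame would be replaced by a table with one richness value and one BMI value per sample.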
@Riffomonas Yes sir, that's a good point: relative abundances have tons of zero values.
One more thing, sir: update your R version from 4.2.0 ("Vigorous Calisthenics") to 4.2.1 ("Funny-Looking Kid").
It really shouldn't matter if the version number changes in the third position.
@Riffomonas Yes sir, got it.