Great work! A really good tool for illustrating the connection between the bootstrapped models' coefficient estimates and the original model's standard error terms.
Hello, one question. Unfortunately I am still learning English, and these types of R videos interest me a lot. I can subtitle the video in English or Spanish (I use OpenAI's Whisper model to extract the subtitles), so if I send you the SRT or VTT file, could you add it to the video so the subtitles can be read in Spanish and/or English?
I didn't know about augment(). For sure I'm going to use it. Thanks!
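For anyone else meeting augment() for the first time, a minimal sketch (the model and data below are illustrative assumptions, not from the video):

```r
library(broom)

# Fit a simple linear model; formula and data are just for illustration
fit <- lm(mpg ~ wt + hp, data = mtcars)

# augment() returns the original data with .fitted, .resid, and friends
# added as new columns, one row per observation
augment(fit)
```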
Hi Julia, is it possible to define and tune a long short-term memory (LSTM) model in "tidymodels"? I searched on the "tidymodels" website but did not find a proper solution. Thank you.
There is no support for specifying a model architecture like an LSTM in tidymodels, but you can "mix and match" tidymodels functionality with other packages. For example, see this chapter on training LSTM models together with recipes, yardstick, and friends: smltar.com/dllstm.html
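As a rough illustration of the "mix and match" idea (the layer sizes and setup below are assumptions, not taken from that chapter), the network itself can come from keras while preprocessing and evaluation come from tidymodels packages:

```r
library(keras)

# Sketch of a small LSTM classifier defined with keras; preprocessing
# (e.g., tokenizing with textrecipes) and evaluation (yardstick metrics
# on the predictions) would happen outside this block
model <- keras_model_sequential() %>%
  layer_embedding(input_dim = 20000, output_dim = 32) %>%
  layer_lstm(units = 32) %>%
  layer_dense(units = 1, activation = "sigmoid")

model %>% compile(
  optimizer = "adam",
  loss = "binary_crossentropy",
  metrics = "accuracy"
)
```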
@JuliaSilge Thank you, Julia.
Hi Julia, thank you for the wonderful tutorial. Can I ask if it is possible to do a "model sensitivity analysis" in "tidymodels"? Generally, sensitivity analysis assesses how “sensitive” the model is to fluctuations in the parameters and data on which it is built.
I believe the sensitivity analysis approaches I have seen out there would work for models trained using tidymodels. Most of the ones I've seen work with `lm()` models, and you can extract that object from your tidymodels workflow or parsnip model.
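A minimal sketch of that extraction step (the formula and data are illustrative assumptions, not from the video):

```r
library(tidymodels)

# Fit a linear model through a tidymodels workflow
lm_wf <- workflow() %>%
  add_formula(mpg ~ wt + hp) %>%
  add_model(linear_reg())

lm_fit <- fit(lm_wf, data = mtcars)

# extract_fit_engine() returns the plain lm object, which lm-based
# sensitivity analysis tools can then operate on
engine_fit <- extract_fit_engine(lm_fit)
class(engine_fit)  #> "lm"
```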
@JuliaSilge Thank you.
This was fun! I can't help but notice that you used the log10 transform for the outcome variable in your resampling models. When would you recommend we do that? I guess I learned the wrong lesson in the past; somehow I internalized the idea that transforming the outcome usually isn't required unless we're doing statistical tests.
You can read a bit more about how/why you might want to transform the outcome here: www.tmwr.org/ames.html#exploring-features-of-homes-in-ames
It can be a quick way to normalize skewed data! Though it removes our ability to draw one-to-one inferences about the relationship between our dependent and independent variables (e.g., "genderMale coincides with a log10 unit increase in area" isn't a very meaningful statement).
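A minimal sketch of that transformation using the Ames data from the chapter linked above (the histogram is just an illustrative check):

```r
library(tidymodels)

data(ames, package = "modeldata")

# Sale_Price is right-skewed; log10-transforming the outcome before
# splitting/resampling makes its distribution more symmetric
ames <- ames %>%
  mutate(Sale_Price = log10(Sale_Price))

ggplot(ames, aes(Sale_Price)) +
  geom_histogram(bins = 50)
```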
This is awesome, thanks so much for sharing!
Thank you so much for sharing this awesome work!
Julia, you're amazing! What a superb, thorough, and clear explanation of the process. I'm sure this will be of great insight for many getting familiar with the versatility of resampling. Have you made any step-through examples using random forests to assess variable relationships? I've discussed with my peers the utility of applying RF when working with multiple variables before running a model; the process departs from an a priori hypothesis, but it sheds light when looking for unsuspected relationships or patterns.
If you mean something like feature importance with Boruta, no, I have not done any screencasts or posts with that. If you mean something more like variable importance from fitting a random forest, then you might be interested in this SO answer: stackoverflow.com/a/72680901/5468471
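A minimal sketch of the variable-importance route (assuming the ranger engine with impurity importance and the vip package; the formula and data are illustrative, not from the video):

```r
library(tidymodels)
library(vip)

# Random forest specification with importance scores enabled
rf_spec <- rand_forest(mode = "classification") %>%
  set_engine("ranger", importance = "impurity")

rf_fit <- fit(rf_spec, Species ~ ., data = iris)

# vip() plots importance from the underlying ranger object
rf_fit %>%
  extract_fit_engine() %>%
  vip()
```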
Love your videos, Julia! I have a question which is a little more general and maybe not so much related to this video. If we want to explore the effect of a numeric or categorical variable on a binary categorical outcome, what should we think of doing first? Thank you again for your great work!
Hmmmm, I may not be understanding your question entirely, but I think I would just go through a normal modeling analysis, starting with EDA to explore such relationships and then building a model where I can estimate feature importance. You can see an example of this for data with a binary outcome and both numeric and categorical variables here: juliasilge.com/blog/sf-trees-random-tuning/
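A minimal sketch of what that EDA step might look like (mtcars' `vs` column stands in for a binary outcome here; these plots are illustrative assumptions, not from the blog post):

```r
library(tidyverse)

# Treat the 0/1 outcome and a categorical predictor as factors
mtcars_tbl <- mtcars %>%
  mutate(vs = factor(vs), cyl = factor(cyl))

# Numeric predictor vs. binary outcome: compare distributions
ggplot(mtcars_tbl, aes(vs, mpg)) +
  geom_boxplot()

# Categorical predictor vs. binary outcome: compare proportions
ggplot(mtcars_tbl, aes(cyl, fill = vs)) +
  geom_bar(position = "fill")
```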
@JuliaSilge Thank you very much!
Thanks for sharing
That was awesome!
Thank you!!!