Thank you, Julia. Another super informative and educational video about different modelling techniques. As someone who is new to this field, I really gain so much from these videos. Keep it up!
Thank you so much for the help in learning how to use Tidymodels effectively
Thanks Julia, I really like your content :-) I didn't know about workflow_sets. They seem really helpful.
Thanks for these excellent videos! I'm not sure if you know this but Cmd-Shift-M will insert a pipe symbol in RStudio (mentioning partly because of the %%> typo at 7:34). As an added bonus it will also insert |> if you tell RStudio that you want to use the native pipe.
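For anyone unsure how the two pipes mentioned above relate, here is a minimal sketch. The vector `x` is made up for illustration, and the native pipe needs R 4.1 or later.

```r
library(magrittr)

# Hypothetical toy vector, only for illustration
x <- c(1, 4, 9)

# magrittr pipe, the one used throughout the video
a <- x %>% sqrt() %>% sum()

# native pipe, available from R 4.1 onward
b <- x |> sqrt() |> sum()

# both forms pass the left-hand side as the first argument
identical(a, b)
```

For simple chains like this the two behave the same; they differ in some details, such as placeholder syntax.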
This is a fantastic explainer, thank you so much Julia! Really appreciate the effort that you and the team are putting in to make {tidymodels} great to both learn and use. I have a request - not sure if this is the place for it so please excuse me if not - but I would love to see some tips on how to visualize the decision boundary on an SVM classifier or a KNN classifier in a future video. In your video about wind turbines from last year you showed the geom_parttree function to display the decision tree boundaries. A similar trick for other algorithms would be amazing. Thank you!
If you're up for using a very "in development" package, you might check out what Emil Hvitfeldt has been playing around with here:
github.com/EmilHvitfeldt/horus/blob/master/R/viz_decision_boundary.R
@JuliaSilge fantastic - thank you so much! I'll give it a go.
Love this!! Thank you
Nice, Julia. Very interesting.
I missed the tidymodels survey I think…what I would love is a DataRobot-like shiny gadget interface to use tidymodels! Perhaps a bit like the esquisse shiny gadget in user interface.
If you are looking for ways to make it easier to generate tidymodels code, you might want to check out the parsnip RStudio addin:
www.tidyverse.org/blog/2021/03/tidymodels-2021-q1/#choose-parsnip-models-with-an-rstudio-addin
Or the usemodels package:
usemodels.tidymodels.org/
Thanks for this fall-flavored video! Since a plus of linear models is their interpretability, how would you interpret the spline terms?
I don't know that I would/could interpret the spline terms individually, but instead in situations like that, I like to use visualization to understand how nonlinear additive components like that work. You can see some examples here:
stats.stackexchange.com/questions/503985/interpretation-of-cubic-spline-coefficients-in-r
stats.stackexchange.com/questions/465444/interpretation-of-coefficients-using-spline-ns-in-glm
If you want to get the fitted spline terms out of your recipe, you can extract and tidy it:
recipes.tidymodels.org/reference/tidy.recipe.html#examples
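A minimal sketch of the visualization approach described above, assuming the built-in `mtcars` data rather than the pumpkin data from the video. The names `spline_rec`, `spline_fit`, `wt_grid` and `spline_curve` are made up for this example. It predicts over a grid of one predictor while holding the other at its median, so the fitted nonlinear shape can be seen directly.

```r
library(tidymodels)

# Assumption: mtcars stands in for the pumpkin data used in the video
spline_rec <- recipe(mpg ~ wt + hp, data = mtcars) %>%
  step_ns(wt, deg_free = 3)   # natural spline basis for wt

spline_fit <- workflow() %>%
  add_recipe(spline_rec) %>%
  add_model(linear_reg()) %>%
  fit(data = mtcars)

# Vary wt over its observed range, hold hp fixed at its median
wt_grid <- tibble(
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 50),
  hp = median(mtcars$hp)
)

spline_curve <- bind_cols(wt_grid, predict(spline_fit, new_data = wt_grid))

# The curve shows the combined effect of all the spline terms for wt
ggplot(spline_curve, aes(wt, .pred)) +
  geom_line()

# Extract the trained recipe and list its steps
tidy(extract_recipe(spline_fit))
```

Reading the curve as a whole is usually easier than reading the individual spline coefficients, which have no simple meaning on their own.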
Is there a way to check linear regression assumptions within tidymodels? Like homoscedasticity, etc., just like using the plot() function on an lm() in base R. Thanks!
Yes! You can use `extract_fit_engine()` to get out the underlying lm object, and then call `plot()` on that to get the plot you are used to using.
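A minimal sketch of that suggestion, assuming the built-in `mtcars` data and a hypothetical workflow called `lm_wf` (not the model from the video):

```r
library(tidymodels)

# Assumption: a simple linear regression workflow on mtcars
lm_wf <- workflow() %>%
  add_formula(mpg ~ wt + hp) %>%
  add_model(linear_reg())   # the default engine is "lm"

lm_fit <- fit(lm_wf, data = mtcars)

# Pull out the underlying lm object from the fitted workflow
lm_engine <- extract_fit_engine(lm_fit)
class(lm_engine)

# The usual base R diagnostic plots (residuals vs fitted, Q-Q, and so on)
par(mfrow = c(2, 2))
plot(lm_engine)
```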
Many thanks!
This is fantastic to follow! Do you have any other visualizations you would recommend on the pumpkin_rs output? Or other metrics to consider when measuring accuracy of predicting weight? Thank you for all your hard work!
I don't think you can make many other plots *directly* from the pumpkin_rs object (other than `autoplot()`) but you can explore and handle those columns in a flexible manner, depending on what you want to do. Here is an example of doing that with workflowsets with another dataset:
workflowsets.tidymodels.org/articles/evaluating-different-predictor-sets.html
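A minimal sketch of exploring those columns by hand, assuming the built-in `mtcars` data and a small hypothetical workflow set called `car_rs` in place of `pumpkin_rs`:

```r
library(tidymodels)

set.seed(123)
car_folds <- vfold_cv(mtcars, v = 5)

# Assumption: two hypothetical recipes, with and without a spline term
base_rec <- recipe(mpg ~ wt + hp, data = mtcars)
spline_rec <- base_rec %>% step_ns(wt, deg_free = 3)

car_set <- workflow_set(
  preproc = list(base = base_rec, spline = spline_rec),
  models = list(lm = linear_reg())
)

# Fit every workflow in the set to the same resamples
car_rs <- workflow_map(car_set, "fit_resamples", resamples = car_folds)

# Metrics come back as a regular tibble you can filter and plot
car_metrics <- collect_metrics(car_rs)
rank_results(car_rs, rank_metric = "rmse")

car_metrics %>%
  filter(.metric == "rmse") %>%
  ggplot(aes(wflow_id, mean)) +
  geom_pointrange(aes(ymin = mean - std_err, ymax = mean + std_err)) +
  labs(x = NULL, y = "RMSE (mean +/- std. error)")
```

Because `collect_metrics()` returns a plain tibble, any dplyr or ggplot2 code works on it from there.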
@JuliaSilge Awesome -- thank you! I am hoping to visualize the tidy(final_fit) table based on the estimate values that you came up with to start. Can you elaborate a little more on the 'spline terms' that had the greatest impact on predicting the outcome? What does that mean in terms of the spline_recipe?
@Odwallaman10 I usually like to use visualization to understand how nonlinear additive components like those spline terms work. You can see some examples here:
stats.stackexchange.com/questions/503985/interpretation-of-cubic-spline-coefficients-in-r
stats.stackexchange.com/questions/465444/interpretation-of-coefficients-using-spline-ns-in-glm
If you want to get the fitted spline terms out of your recipe, you can extract and tidy it:
recipes.tidymodels.org/reference/tidy.recipe.html#examples
I've just watched the Scooby Doo video
Many thanks for your fantastic videos 🙏🙏
Thanks very much, really clear example. Can I ask, where did you get the light on your wall?
HA sure, they are Nanoleaf lights and I got them at Costco.
@JuliaSilge I'll see if Costco stock them in the UK - cheers
Thanks for another great video, Julia! I'm a little confused about your 'final fit' vs. using 'last fit'. I know last_fit does the final fit on the whole training data and then runs against the testing set. Is there a way to use last_fit but still be able to use tidy() and examine the model parameters? Thanks
In this case, you could definitely `extract_workflow()` to get the one you want, then `last_fit()` to both fit to the training data and evaluate on the testing data, then `extract_workflow()` again from that object to get the *fitted* workflow, and then `tidy()` that. Our hope/plan is that you always have functions/verbs available to handle/extract any object you want in a modeling analysis.
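A minimal sketch of that sequence, assuming the built-in `mtcars` data and a single hypothetical workflow `lm_wf`. With a workflow set like the one in the video, you would first pull out the workflow you want with `extract_workflow()` and an `id`.

```r
library(tidymodels)

set.seed(123)
car_split <- initial_split(mtcars)

# Assumption: a plain workflow standing in for the one chosen from the set
lm_wf <- workflow() %>%
  add_formula(mpg ~ wt + hp) %>%
  add_model(linear_reg())

# Fit on the training data and evaluate on the testing data in one step
final_res <- last_fit(lm_wf, car_split)
collect_metrics(final_res)

# Get the fitted workflow back out, then tidy the model coefficients
fitted_wf <- extract_workflow(final_res)
coefs <- tidy(fitted_wf)
coefs
```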