Predict giant pumpkin weights with tidymodels workflowsets

Julia Silge

Published Feb 7, 2025

COMMENTS • 26

@N1loon 3 years ago ⁺⁴
Thank you, Julia. Another super informative and educational video about different modelling techniques. As someone who is new to this field, I really gain so much from these videos. Keep it up!
@jamescrumpler3438 3 years ago
Thank you so much for the help in learning how to use Tidymodels effectively
@mister_yog 3 years ago ⁺¹
Thanks Julia, I really like your content :-) I didn't know about workflow_sets. They seem really helpful.
@brynhumberstone 3 years ago ⁺³
Thanks for these excellent videos! I'm not sure if you know this but Cmd-Shift-M will insert a pipe symbol in RStudio (mentioning partly because of the %%> typo at 7:34). As an added bonus it will also insert |> if you tell RStudio that you want to use the native pipe.
@jonathanjayes 3 years ago ⁺¹
This is a fantastic explainer, thank you so much Julia! Really appreciate the effort that you and the team are putting in to make {tidymodels} great to both learn and use. I have a request - not sure if this is the place for it so please excuse me if not - but I would love to see some tips on how to visualize the decision boundary on an SVM classifier or a KNN classifier in a future video. In your video about wind turbines from last year you showed the geom_parttree function to display the decision tree boundaries. A similar trick for other algorithms would be amazing. Thank you!
@JuliaSilge 3 years ago ⁺²
If you're up for using a very "in development" package, you might check out what Emil Hvitfeldt has been playing around with here:
github.com/EmilHvitfeldt/horus/blob/master/R/viz_decision_boundary.R
@jonathanjayes 3 years ago
@JuliaSilge Fantastic, thank you so much! I'll give it a go.
@j7andrew 2 years ago
Love this!! Thank you
@gustavoantoniobrugesmorale1881 3 years ago
Nice, Julia. Very interesting.
@PA_hunter 2 years ago
I missed the tidymodels survey I think…what I would love is a DataRobot-like shiny gadget interface to use tidymodels! Perhaps a bit like the esquisse shiny gadget in user interface.
@JuliaSilge 2 years ago ⁺²
If you are looking for ways to make it easier to generate tidymodels code, you might want to check out the parsnip RStudio addin:
www.tidyverse.org/blog/2021/03/tidymodels-2021-q1/#choose-parsnip-models-with-an-rstudio-addin
Or the usemodels package:
usemodels.tidymodels.org/
@jessicahoehner6715 3 years ago ⁺¹
Thanks for this fall-flavored video! Since a plus of linear models is their interpretability, how would you interpret the spline terms?
@JuliaSilge 3 years ago ⁺²
I don't know that I would (or could) interpret the spline terms individually. In situations like that, I prefer to use visualization to understand how nonlinear additive components work. You can see some examples here:
stats.stackexchange.com/questions/503985/interpretation-of-cubic-spline-coefficients-in-r
stats.stackexchange.com/questions/465444/interpretation-of-coefficients-using-spline-ns-in-glm
If you want to get the fitted spline terms out of your recipe, you can extract and tidy it:
recipes.tidymodels.org/reference/tidy.recipe.html#examples
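A minimal sketch of that extract-and-tidy step, assuming a recent tidymodels installation. The built-in `mtcars` data, the formula, and the object names (`spline_rec`, `spline_fit`, `trained_rec`) are stand-ins chosen for illustration; they are not the pumpkin data or the recipe from the video.

```r
library(tidymodels)

# Illustrative recipe with natural spline terms on one predictor
# (mtcars and these variable choices are assumptions, not the video's data)
spline_rec <- recipe(mpg ~ wt + hp, data = mtcars) %>%
  step_ns(wt, deg_free = 3)

# Fit a linear model using the recipe as the preprocessor
spline_fit <- workflow() %>%
  add_recipe(spline_rec) %>%
  add_model(linear_reg()) %>%
  fit(data = mtcars)

# Pull the trained recipe back out of the fitted workflow
trained_rec <- extract_recipe(spline_fit)

# One row per recipe step
tidy(trained_rec)

# Details for the first (spline) step
tidy(trained_rec, number = 1)

# Model coefficients, including one per spline basis column
tidy(spline_fit)
```

The coefficient table shows one estimate per spline basis column, which is why plotting predictions across the range of the predictor is usually more informative than reading the terms one by one.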
@PA_hunter 2 years ago
Is there a way to check linear regression assumptions within tidymodels, like homoscedasticity, just as you would with the plot() function on an lm() fit in base R? Thanks!
@JuliaSilge 2 years ago ⁺²
Yes! You can use `extract_fit_engine()` to get out the underlying lm object, and then call `plot()` on that to get the plots you are used to.
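As a rough sketch of that approach, assuming a workflow fitted with the `"lm"` engine: the `mtcars` data, the formula, and the names `lm_wf`, `lm_fit`, and `lm_engine` below are illustrative assumptions, not objects from the video.

```r
library(tidymodels)

# A small linear regression workflow (mtcars is a stand-in dataset)
lm_wf <- workflow() %>%
  add_formula(mpg ~ wt + hp) %>%
  add_model(linear_reg() %>% set_engine("lm"))

lm_fit <- fit(lm_wf, data = mtcars)

# Extract the underlying stats::lm object from the fitted workflow
lm_engine <- extract_fit_engine(lm_fit)

# The usual base R diagnostic plots (residuals vs fitted, Q-Q, and so on)
par(mfrow = c(2, 2))
plot(lm_engine)
```

Because `lm_engine` is an ordinary lm object, other base R tools such as `summary()` or `residuals()` should work on it as well.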
@darmaw22 3 years ago
Many thanks!
@Odwallaman10 3 years ago
This is fantastic to follow! Do you have any other visualizations you would recommend on the pumpkin_rs output? Or other metrics to consider when measuring accuracy of predicting weight? Thank you for all your hard work!
@JuliaSilge 3 years ago ⁺¹
I don't think you can make many other plots *directly* from the pumpkin_rs object (other than `autoplot()`), but you can explore and handle those columns in a flexible manner, depending on what you want to do. Here is an example of doing that with workflowsets and another dataset:
workflowsets.tidymodels.org/articles/evaluating-different-predictor-sets.html
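A hedged sketch of exploring a resampled workflow set beyond `autoplot()`. The `mtcars` data, the two formulas, and the names `folds`, `wf_set`, and `wf_rs` are assumptions for illustration; `wf_rs` plays the role that pumpkin_rs plays in the video.

```r
library(tidymodels)

set.seed(123)
folds <- vfold_cv(mtcars, v = 5)

# Two illustrative preprocessors crossed with one model
wf_set <- workflow_set(
  preproc = list(small = mpg ~ wt, bigger = mpg ~ wt + hp),
  models = list(lm = linear_reg())
)

# Fit every workflow to the resamples
wf_rs <- workflow_map(wf_set, "fit_resamples", resamples = folds)

# Metrics as a plain tibble you can filter, reshape, or plot yourself
collect_metrics(wf_rs)

# Workflows ordered by a chosen metric
rank_results(wf_rs, rank_metric = "rmse")

# The built-in plot
autoplot(wf_rs)
```

Since `collect_metrics()` returns an ordinary tibble, any custom ggplot2 chart can be built from it.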
@Odwallaman10 3 years ago
@JuliaSilge Awesome, thank you! I am hoping to visualize the tidy(final_fit) table based on the estimate values that you came up with to start. Can you elaborate a little more on the 'spline terms' that had the greatest impact on predicting the outcome? What does that mean in terms of the spline_recipe?
@JuliaSilge 3 years ago
@Odwallaman10 I usually like to use visualization to understand how nonlinear additive components like those spline terms work. You can see some examples here:
stats.stackexchange.com/questions/503985/interpretation-of-cubic-spline-coefficients-in-r
stats.stackexchange.com/questions/465444/interpretation-of-coefficients-using-spline-ns-in-glm
If you want to get the fitted spline terms out of your recipe, you can extract and tidy it:
recipes.tidymodels.org/reference/tidy.recipe.html#examples
@ammarparmr 3 years ago
I've just watched the Scooby Doo video
Many thanks for your fantastic videos 🙏🙏
@michaelepstein8356 3 years ago
Thanks very much, really clear example. Can I ask, where did you get the light on your wall?
@JuliaSilge 3 years ago ⁺¹
HA sure, they are Nanoleaf lights and I got them at Costco.
@michaelepstein8356 3 years ago
@JuliaSilge I'll see if Costco stock them in the UK. Cheers!
@jeffrothschildmsrd5633 3 years ago
Thanks for another great video, Julia! I'm a little confused about your 'final fit' vs. using 'last fit'. I know last_fit does the final fit on the whole training data and then evaluates against the testing set. Is there a way to use last_fit but still be able to use tidy() and examine the model parameters? Thanks
@JuliaSilge 3 years ago ⁺³
In this case, you could definitely `extract_workflow()` to get the one you want, then `last_fit()` to both fit to the training data and evaluate on the testing data, then `extract_workflow()` again from that object to get the *fitted* workflow, and then `tidy()` that. Our hope/plan is that you have functions/verbs available to you to always be able to handle/extract each object in a modeling analysis that you want to.
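The sequence described above might look like the following sketch. The `mtcars` data, the split, and the names `car_split`, `wf_set`, `wf`, `final_res`, and `fitted_wf` are hypothetical stand-ins, and the workflow id `"basic_lm"` comes from how this toy workflow set is named, not from the video.

```r
library(tidymodels)

set.seed(123)
car_split <- initial_split(mtcars)

# A toy workflow set; ids are built as "{preproc}_{model}"
wf_set <- workflow_set(
  preproc = list(basic = mpg ~ wt + hp),
  models = list(lm = linear_reg())
)

# 1. Extract the (unfitted) workflow you want
wf <- extract_workflow(wf_set, id = "basic_lm")

# 2. Fit on the training data and evaluate on the testing data
final_res <- last_fit(wf, car_split)
collect_metrics(final_res)

# 3. Extract the *fitted* workflow from the last_fit() result
fitted_wf <- extract_workflow(final_res)

# 4. Examine the model parameters
tidy(fitted_wf)
```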
