Really interesting and helpful tutorial. Thank you!
Thanks for the comment, I am glad you liked it.
Again, a great video - thanks for all the hard work!
Glad you liked it and left a comment. It is hard, but also enjoyable work, especially if others appreciate it. So thanks again for the comment.
Thanks for the video. There were a few things I did not know.
btw, a slightly different approach:
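library(tidyverse)
library(HistData)

# Standard deviation of every y column
anscombe %>%
  select(contains("y")) %>%
  summarise_all(~sd(.x))

# The definition of `dictinary` was lost from this comment; it needs a
# `dataset` column to join on and a `description` column to group by.
dictinary <- ...

# Reshape to long format, attach the descriptions, then average x per description
anscombe %>%
  pivot_longer(everything()) %>%
  extract(name, into = c("variable", "dataset"), regex = "(x|y)(\\d)", convert = TRUE) %>%
  left_join(dictinary, by = "dataset") %>%
  mutate(
    id = rep(1:11, each = 8),
    dataset = paste("dataset", dataset)
  ) %>%
  pivot_wider(names_from = variable) %>%
  group_by(description) %>%
  summarise(mean = mean(x))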
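The definition of `dictinary` is missing from the comment above, so the second pipeline cannot run as posted. Here is a minimal sketch of how it could work, assuming the lookup table has an integer `dataset` column and a `description` column. The descriptions below are placeholders, not the commenter's original text.

```r
library(dplyr)
library(tidyr)

# Hypothetical lookup table: the original definition is not available,
# so these descriptions are placeholders for illustration only.
dictinary <- tibble(
  dataset = 1:4,
  description = c("set A", "set B", "set C", "set D")
)

result <- anscombe %>%
  # 11 rows x 8 columns become 88 rows with columns `name` and `value`
  pivot_longer(everything()) %>%
  # split names such as "x1" into variable = "x" and dataset = 1 (integer)
  extract(name, into = c("variable", "dataset"), regex = "(x|y)(\\d)", convert = TRUE) %>%
  left_join(dictinary, by = "dataset") %>%
  mutate(
    # each original row contributed 8 consecutive values, so this restores the row id
    id = rep(1:11, each = 8),
    dataset = paste("dataset", dataset)
  ) %>%
  # back to one row per dataset and id, with separate x and y columns
  pivot_wider(names_from = variable) %>%
  group_by(description) %>%
  summarise(mean = mean(x))

print(result)
```

With these placeholder descriptions, every group ends up with the same mean of x, which is the point of Anscombe's quartet.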
Thanks for the additional code. I had to try it out right away. I was aware of the contains() function within select() but haven't used summarise_all() much yet, especially not with an expression like "~sd(.x)". It works quite well, and I really like the dictionary and left_join() approach. These things are really powerful and helpful, and they can replace case_when() write-ups etc. I have to learn a bit more about extract() and regular expressions. Glad to have you as a follower, and thanks again for the good comment. I hope others see it as well and learn from the code.
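To make the comparison between a lookup table and case_when() concrete, here is a small sketch. The data, column names, and labels are made up for illustration.

```r
library(dplyr)

# Hypothetical data and labels, for illustration only
scores <- tibble(id = 1:4, group = c(1L, 2L, 3L, 2L))

# case_when(): every mapping is written out in code
with_case_when <- scores %>%
  mutate(label = case_when(
    group == 1L ~ "low",
    group == 2L ~ "medium",
    group == 3L ~ "high"
  ))

# Lookup table: the mapping lives in a small data frame and is joined in
lookup <- tibble(group = 1:3, label = c("low", "medium", "high"))
with_join <- scores %>%
  left_join(lookup, by = "group")

print(with_join)
```

Both versions give the same labels. The lookup table tends to be easier to maintain once the mapping grows, because adding a category means adding a row rather than editing code.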
@TheDataDigest glad to be here.
~ is the tidyverse shorthand for a lambda (anonymous) function.
I use `summarise_all(~mean(is.na(.x)))` to quickly find the proportion of missing values in every column of a data frame.
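A quick sketch of that missing-value check, using the built-in `airquality` data set (my choice for illustration, since it contains NAs). summarise_all() still works, but dplyr 1.0.0 superseded it in favour of across(), so both spellings are shown, plus the same lambda written as an ordinary anonymous function.

```r
library(dplyr)

# Scoped verb with the ~ shorthand, as in the comment
na_share_all <- airquality %>%
  summarise_all(~mean(is.na(.x)))

# Same computation with across()
na_share_across <- airquality %>%
  summarise(across(everything(), ~mean(is.na(.x))))

# The ~ shorthand stands for an anonymous function of one argument
na_share_fn <- airquality %>%
  summarise(across(everything(), function(col) mean(is.na(col))))

print(na_share_across)
```

Multiply the result by 100 if you want percentages rather than proportions.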
Or, if I have a large table with 100 columns where every column containing "pct" or "loc" should be divided by the population column to get a relative value, I use:
mutate(across(contains(c("pct", "loc")), ~.x/population))
You can use this as well: mutate(across(where(is.numeric), ~.x/100)) to divide every numeric column by 100, e.g. to turn percentages into proportions.
example:
iris %>%
  mutate(across(contains("Sepal"), ~.x/Petal.Width))
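A runnable version of the iris example, as a sketch. By default across() overwrites the selected columns; the `.names` argument (the `_rel` suffix is just my choice) keeps the originals and adds new columns instead.

```r
library(dplyr)

# Overwrite both Sepal columns with their ratio to Petal.Width
scaled <- iris %>%
  mutate(across(contains("Sepal"), ~.x / Petal.Width))

# Keep the original columns and add Sepal.Length_rel and Sepal.Width_rel
kept <- iris %>%
  mutate(across(contains("Sepal"), ~.x / Petal.Width, .names = "{.col}_rel"))

head(kept)
```

One thing to watch with where(is.numeric): it selects every numeric column, including the one you divide by, so a helper like contains() is usually safer when the divisor is itself a column of the table.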
@Zoyfad Thanks for clarifying the code even further. I think it would make a very useful video if I could collect time-saving "hacks" like these that experienced R programmers like you have found and use in their analyses. I will probably ask around on Twitter as well, do some online research, and then try to compile a top-10 list. :) We will see, there are so many topics one can cover...
Great Video
Great video, thanks a lot!
Thanks for leaving a comment. I am glad you liked it. 😀
Sir, please explain Anscombe's quartet in a video.
Hi Shuchi, what do you mean? I thought the video above explained Anscombe's quartet?