QUESTION OF THE DAY: How do you use R to perform "Exploratory Data Analysis"? What R functions or packages?
Well, there is the package 'FactoMineR', which is pretty useful.
I feel like ggplot2 is an easy go-to for EDA in R. And now, having watched the video, I'd say View(), head(), tail(), summary(), and sum(is.na()) are a good place to start, with the skimr package's skim() function being my favourite from this video.
We can also use the "DataExplorer" package to gain insights into the data.
Thank you, professor, for making content like this. Your speciality in R with Bioinformatics is very helpful in my case, as I am interning at Johns Hopkins University, where they mainly use R programming. A lot of content on YouTube is in Python, and yours truly helps as you cover both. Greatly appreciated and thank you krup!
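As a rough illustration of the functions mentioned in the replies above, here is a minimal first-pass EDA sketch on R's built-in iris data set. It uses base R only; the skimr call at the end is optional and assumes the package is installed. The variable name n_missing is just for illustration.

```r
# First-pass EDA on the built-in iris data set (base R only).
data(iris)

head(iris, 5)      # first 5 rows
tail(iris, 5)      # last 5 rows
summary(iris)      # per-column summary statistics
dim(iris)          # number of rows and columns

# Total count of missing values across the whole data frame.
n_missing <- sum(is.na(iris))
n_missing

# Optional: skimr gives a richer overview, but only if it is installed.
if (requireNamespace("skimr", quietly = TRUE)) {
  print(skimr::skim(iris))
}
```

View(iris) opens the spreadsheet-style viewer in RStudio; it is left out here because it needs an interactive session.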
Waz, thanks krub for your kind comments! It is certainly a pleasure that you find the contents of this channel helpful.
Earlier today I just released a new tutorial video on using Python for Computational Drug Discovery ua-cam.com/video/VXFFHHoE1wk/v-deo.html
What a legend man! Thank you for sharing your knowledge, going to follow this R project series - dropped a sub 😄
Awesome, thank you!
When I try Method 2 using get URL, I get the error below. Any suggestions? Thx.
Error in function (type, msg, asError = TRUE) :
SSL certificate problem: certificate has expired
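One possible workaround, offered as a sketch rather than the method from the video: since iris ships with R in the datasets package, you can fall back to the built-in copy whenever the download fails. The helper name load_iris and the temporary file path below are hypothetical, chosen only to demonstrate the fallback without needing a network connection.

```r
# Try to read a CSV from a path or URL; if that fails (for example because
# of an SSL certificate error), fall back to the copy of iris bundled with R.
load_iris <- function(path_or_url) {
  tryCatch(
    read.csv(path_or_url, stringsAsFactors = TRUE),
    error = function(e) {
      message("Could not read '", path_or_url, "': ", conditionMessage(e))
      message("Using datasets::iris instead.")
      datasets::iris
    }
  )
}

# Demonstration with a file that does not exist, so the fallback is used.
# R may also print a "cannot open file" warning here; that is expected.
iris_df <- load_iris(file.path(tempdir(), "missing-iris.csv"))
dim(iris_df)
```

Column names in a downloaded CSV may differ from those in datasets::iris, so check names(iris_df) before running the rest of the tutorial code.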
Thank you for the video! I found it very interesting, especially the skimr package.
Thanks Kalin for your comment. 😄
Thank you, professor, it really helped me with my project.
You are welcome!
Hi, I have problems with function skim(): Error in base::nchar(wide_chars$test, type = "width") :
lazy-load database '/Library/Frameworks/R.framework/Versions/3.6/Resources/library/cli/R/sysdata.rdb' is corrupt
In addition: Warning messages:
1: In base::nchar(wide_chars$test, type = "width") :
restarting interrupted promise evaluation
2: In base::nchar(wide_chars$test, type = "width") :
internal error -3 in R_decompress1.
Could you advise me, please?
Thank you. Your explanation is very clear
SP-Francina GOH Thank you so much for the kind words 😊
Thanks for the explicit video.
You're welcome!
You know this job. Nice content.
Thanks!
Prof., my skim(iris) is not working even after installing the skimr package. It shows that the skimr package was successfully unpacked, but when I run the command it says skim was not found.
I ran this code:
library(dplyr)
iris%>%
dplyr::group_by(species)%>%
skim()
but the output is...
iris%>%
+ dplyr::group_by(species)%>%
+ skim()
Error in get(nm, envir = fn, mode = "function") :
object 'skim' of mode 'function' was not found
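A likely cause, judging only from the code shown: skimr was installed but never attached, so skim() is not visible to the pipe. Note also that the column in iris is named Species with a capital S. A minimal sketch, assuming both dplyr and skimr are installed (the variable name grouped_skim is just for illustration):

```r
# install.packages(c("dplyr", "skimr"))  # run once if not yet installed
library(dplyr)
library(skimr)   # attaching skimr is what makes skim() available

grouped_skim <- iris %>%
  dplyr::group_by(Species) %>%   # the column name is capitalised in iris
  skim()

grouped_skim
```

Alternatively, calling skimr::skim() with the package prefix works without library(skimr).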
I get this error when executing the dplyr function: Error in UseMethod("group_by_") :
no applicable method for 'group_by_' applied to an object of class "factor"
Thank you. Can we use R version 4.0 for these tutorials?
Hi, I haven't tested this in version 4 yet.
How can I download the dataset you used ???
Thank you! Your videos are super helpful :)
You're very welcome!
Hi, thanks for the video. At the beginning of the video you mentioned a link to the 6 steps... where is the link?
Thanks Siamak for pointing this out, I've added the link in the description. Links to videos in Data Science 101:
bit.ly/dataprofessor-ds101
Is the 101 video you're talking about supposed to be in the "R Data Science Project" library?
I don't see it :(
Thank you, data doctor, for the detailed explanations. In RStudio I keep getting error messages when running this code:
iris %%
dplyr::group_by(species) %%
skim() - (Error in UseMethod("group_by") :
no applicable method for 'group_by' applied to an object of class "factor")... Please, what am I doing wrong?
use %>%
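To expand on that reply with a small sketch: %% is R's modulo operator, whereas %>% is the pipe that passes the left-hand side into the function on the right. The example below assumes dplyr is installed and uses summarise() rather than skim() so that it depends on dplyr alone; species_means is an illustrative name.

```r
library(dplyr)

# %% is arithmetic modulo, not a pipe.
7 %% 3

# %>% feeds the data frame into group_by(), then into summarise().
species_means <- iris %>%
  group_by(Species) %>%          # capital S, as in the iris column names
  summarise(mean_sepal_length = mean(Sepal.Length))

species_means
```

With %%, R never passes iris into group_by(); it tries to evaluate group_by(species) on its own, which is probably why the error complains about the class of the argument rather than about the operator.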
Great Video, Thanks
Thanks for watching Fahad!
Amazing video. Thanks a lot!
Thanks Desmond for the kind words!
I am still struggling with my dataset, could we meet over zoom?
Hi, could you please share the link for the next video ? Thank you so much for this tutorial
I am doing a master's degree in data science, and in my final year we have an elective in bioinformatics. Do I need an understanding of biology to study bioinformatics? I am from a tech background and do not know much about biology beyond some high school knowledge.
All you need to get started is high school biology; the rest you can read up on when needed. The hardest part of bioinformatics is the computational proficiency. The biology is important at the model interpretation phase, but to get started, computational proficiency can go a long way. Have fun exploring this exciting field, and please check out the Bioinformatics playlist I’ve created at bit.ly/dataprofessor-bioinformatics
Hello sir, I got an assignment. Can you please help me with it?
Could you kindly do a video about Extreme Value Theory, especially Peaks Over Threshold and Annual Maximum?
Hi Kisakye, thanks for the suggestion, I came across this book chapter that covers the topic, please have a look here link.springer.com/chapter/10.1007/978-3-030-28669-9_3
Data Professor thank you
Data Professor, is there one with R code that someone can follow along with in RStudio?
Hi professor thanks for this amazing content
Do you have any similar projects but using Python?
Yes, here is the Python EDA video ua-cam.com/video/9m4n2xVzk9o/v-deo.html
I wish you would do more R videos.
Thank you
I will try
Great!
I think you can do more than that with R. However, thanks.
Thanks for your comment. Yes, exactly, more videos on R data science projects are coming up. The next 2 videos will 1) visualize the iris data set and 2) build a classification model for predicting the class label. Please stay tuned. 😀
As a sneak peek of what's to come, I will eventually cover how you can build a data-driven web app using R and shiny. An example of a web app that I've developed is codes.bio/osfp, and further detail on the implementation is published at jcheminf.biomedcentral.com/articles/10.1186/s13321-016-0185-8
you can use a pistachio to open a pistachio
Too long video, thumbs down.