Understanding missing data and missing values. 5 ways to deal with missing data using R programming

  • Published 1 Dec 2024

COMMENTS • 57

  • @gregmartin
    @gregmartin 1 year ago

    Get my FREE cheat sheets for Public Health, Epidemiology, Research Methods and Statistics (including transcripts of these lessons) here: www.learnmore365.com/courses/public-health-epidemiology-research-methods-and-statistics-resource-library

  • @arifmemovic3383
    @arifmemovic3383 3 years ago +3

    You have saved hundreds if not thousands of hours of beginning analysts' time. Thanks!

  • @danmungai555
    @danmungai555 4 years ago +22

    Hello sir, this is amazing. You're a wonderful teacher. Please do more. Very many thanks from me here in Kenya

    • @gregmartin
      @gregmartin 4 years ago +4

      Thank you very much for the feedback. I’ve been to Kenya. Lovely country.

    • @danmungai555
      @danmungai555 4 years ago +1

      I have been having problems with functions, can you help? I would appreciate so much

  • @chertify
    @chertify 2 years ago +4

    I'm halfway through watching and I just hit subscribe. The content in this video is so well explained! You translate code into layman's terms and have "tidily" edited your video! I love the zoom in and out effect and the sound effects. Not too much. Just right. Not annoying, rather impressive. Thank you for sharing your knowledge with us, Greg!

    • @chertify
      @chertify 2 years ago +1

      I just finished watching and taking down notes. Huge applause to you, Greg!!!

    • @gregmartin
      @gregmartin 2 years ago +1

      What awesome feedback, thank you! I really appreciate it!

  • @rezzyraptor
    @rezzyraptor 2 years ago +5

    I know this video is old, but still very helpful! I love your channel, you make stats and R fun :D Thanks for making these, keep up the great work.

    • @gregmartin
      @gregmartin 2 years ago

      Glad you like them!

    • @sakanablesakanable
      @sakanablesakanable 1 year ago

      Thank you so much Greg, can you please tell me what software you use for video editing?
      Thanks in advance

  • @asiathogmartin7725
    @asiathogmartin7725 4 years ago +5

    You said, "Boom Shakalaka" LOL! Most awesome video ever.

  • @fernleaf1
    @fernleaf1 3 years ago +2

    Great video. Looking forward to your videos about imputation and the MICE package. Keep’em coming!

  • @ostione
    @ostione 3 years ago

    Best R tutorial: visuals, pace, delivery... so good!

  • @Junecode
    @Junecode 6 months ago

    Greg, thanks for ALL your elaborate videos and the structure of the lessons. In addition, the way you explain the code methodically! Love it. I was so stressed about replacing NA with none for the variable, gender (Pt 3 of handling missing values), turns out the variable is sex. Phew

  • @lilikoimahalo
    @lilikoimahalo 6 months ago

    This is a very insightful explanation:) thank you!

    • @gregmartin
      @gregmartin 5 months ago

      Glad you find it insightful. Thank you

  • @hazemshahin4166
    @hazemshahin4166 2 years ago

    great way of yours to finally simplify stats ...thank you

  • @adrianfletcher8963
    @adrianfletcher8963 4 years ago +3

    Please do a video on imputation in R! I was working on something and I was confused as to whether my data was "missing at random" or another option so I wasn't sure how to handle imputation.

  • @nour_hisham
    @nour_hisham 2 years ago +1

    Thanks for the help, really appreciate, I have exam tomorrow, and you really helped Sir.😃❤

    • @gregmartin
      @gregmartin 2 years ago

      Glad it was helpful! Thank you :)

  • @tuanlong9238
    @tuanlong9238 4 years ago +3

    11:48 - " Take care, stay well, don't do drugs, always do best, speak to you soon. Bye! " - that's a cool outro

  • @topabschalala9900
    @topabschalala9900 1 year ago

    Impressed!

  • @xprownz
    @xprownz 4 years ago

    Great video, helped me a lot cleaning some datasets in an easy way.

  • @dineshlakshitha1259
    @dineshlakshitha1259 3 years ago

    Super video,
    clear.
    Thank you so much

  • @ousmanelom6274
    @ousmanelom6274 4 years ago

    You are a good teacher. I like your video.

  • @ruhafza4719
    @ruhafza4719 4 years ago

    Waoo...another great video

  • @markelov
    @markelov 1 year ago

    Could you please make a video on testing MCAR and, given its assumption of multivariate normality, talk specifically about what to do with factor variables or logicals?

  • @deadlyderp5856
    @deadlyderp5856 2 years ago

    Hello Greg,
    I have a question. I ran the following code and I want to run a regression on the adjusted dataset now. However, it takes the unadjusted dataset instead of the adjusted one. Also, it creates a new dataset called "." (so just a dot). This dataset is the correct adjusted one, but I cannot even use it. I am confused.
    library(dplyr)
    library(ggplot2)
    library(tidyverse)
    iabbd_8010_v1%>%
    select(Destination, Year, Origin, Mstock_Total, Mstock_Low, Mstock_Med, Mstock_High, Distance, Democracy_origin, Democracy_destination, GDPpc_origin, GDPpc_destination, Language, Population_origin, Population_destination, Border)%>%
    mutate(Mstock_Total = replace(Mstock_Total, Mstock_Total == 0, NA))%>%
    drop_na(Mstock_Total)%>%
    mutate(Mstock_Total = log(Mstock_Total))%>%
    mutate(Population_origin = log(Population_origin))%>%
    mutate(Population_destination = log(Population_destination))%>%
    mutate(Distance = log(Distance))%>%
    mutate(GDPpc_destination = log(GDPpc_destination))%>%
    mutate(GDPpc_origin = log(GDPpc_origin))%>%
    View()
    reg1 = lm(Mstock_Total ~ Distance + Language + Border, data = iabbd_8010_v1)  # so actually I should use data = . but it says "." doesn't exist
    summary(reg1)
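
A likely explanation for the behaviour described above: the pipeline ends in `View()` and its result is never assigned to a name, so `iabbd_8010_v1` itself is unchanged and `lm()` still receives the original data. A minimal sketch of the usual fix follows. The data frame `migration` and the object name `migration_adj` are made up for illustration and only stand in for the commenter's data; the dplyr and tidyr calls are the ones used in the comment.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data standing in for iabbd_8010_v1
migration <- data.frame(
  Mstock_Total = c(0, 10, 100, 1000, 50, 500),
  Distance     = c(100, 200, 400, 800, 300, 600),
  Border       = c(1, 1, 0, 0, 1, 0)
)

# Assign the result of the pipeline to a new object instead of ending in View()
migration_adj <- migration %>%
  mutate(Mstock_Total = replace(Mstock_Total, Mstock_Total == 0, NA)) %>%
  drop_na(Mstock_Total) %>%
  mutate(
    Mstock_Total = log(Mstock_Total),
    Distance     = log(Distance)
  )

# Fit the model on the adjusted object, not on the original data frame
reg1 <- lm(Mstock_Total ~ Distance + Border, data = migration_adj)
summary(reg1)
```

Calling `View(migration_adj)` afterwards still opens the adjusted data for inspection, without it being the last step of the pipeline.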

  • @eridianestrada8923
    @eridianestrada8923 4 years ago

    Hello, thank you for these videos. They are very helpful. Is there a video on what program evaluation is and how that looks in the global health context?

  • @sydbyd5040
    @sydbyd5040 2 years ago

    🖐 Great video, thanks. But it didn't work for my case.
    There is a character column in my table (14 columns × 50,000 rows) with up to 7,000 missing values, but na.omit() can't find them.
    Is it possible that na.omit() can't find them because they are invisible typed spaces?
    I hope I was clear.
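
`na.omit()` only drops rows that contain a true `NA`; an empty string or a run of spaces in a character column is an ordinary value, so it is kept. If that is what is happening here (an assumption, since the data is not shown), converting blank entries to `NA` first should make them visible. A base R sketch with a made-up data frame `df` and column `city`:

```r
# Hypothetical data: two entries look empty but are not NA
df <- data.frame(
  id   = 1:5,
  city = c("Nairobi", "", "   ", "Geneva", NA),
  stringsAsFactors = FALSE
)

sum(is.na(df$city))          # counts only the one true NA

# Treat empty or whitespace-only strings as missing
blank <- !is.na(df$city) & trimws(df$city) == ""
df$city[blank] <- NA

sum(is.na(df$city))          # now counts the blank entries as well
df_complete <- na.omit(df)   # drops every row with a missing city
```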

  • @mugambwajonah8352
    @mugambwajonah8352 10 months ago

    I like the audio quality

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 3 years ago

    Greg,
    I am having trouble seeing the difference between changing missing data to a value vs imputation. Are they not the same? Can you explain the difference?
    Thanks!
    Great lessons by the way.
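
The two overlap: replacing every `NA` with one fixed value chosen by the analyst is arguably the simplest form of imputation, while the term more often refers to estimating the missing values from the observed data (for example the mean, or a model-based method such as MICE, which other commenters mention). A small base R sketch of the contrast, using a made-up vector `height`:

```r
# Hypothetical measurements with two missing values
height <- c(150, 160, NA, 170, NA, 180)

# 1. Replace NA with a fixed value chosen by the analyst
height_fixed <- replace(height, is.na(height), 0)

# 2. Simple mean imputation: estimate the value from the observed data
height_mean <- replace(height, is.na(height), mean(height, na.rm = TRUE))

mean(height_fixed)  # pulled down by the arbitrary zeros
mean(height_mean)   # equal to the mean of the observed values
```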

  • @mayankvermashubhamkaran4373
    @mayankvermashubhamkaran4373 2 years ago

    drop_na() and complete.cases() worked perfectly in RStudio.
    But when I write the same code in Kaggle, the new data frame doesn't have any values.
    Any suggestions?

  • @heartheart5543
    @heartheart5543 4 years ago

    Dear Greg, I've been watching all your R videos on your other channel, "R Programming 101". Why didn't you put this R video on that channel?

  • @16kush
    @16kush 2 years ago

    Can we replace the NAs without using a library?
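
Base R can do this without loading any package. A minimal sketch, assuming a made-up data frame `patients` with a character column `sex` and a numeric column `age`; the replacement values are arbitrary choices for illustration:

```r
# Hypothetical data with missing values in both columns
patients <- data.frame(
  sex = c("male", NA, "female", NA),
  age = c(34, NA, 51, 28),
  stringsAsFactors = FALSE
)

# Logical indexing with is.na() selects the missing entries to overwrite
patients$sex[is.na(patients$sex)] <- "none"
patients$age[is.na(patients$age)] <- median(patients$age, na.rm = TRUE)
```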

  • @rajiahdynsley2356
    @rajiahdynsley2356 3 years ago

    Great vid but instead of using the "%>%" function, how could we have done it? Since we are not able to save these changes made to the original dataset using "%>%" function.
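
The pipe on its own never modifies the original data; changes are kept by assigning the result to a name with `<-`. The same steps can also be written without `%>%` as ordinary nested calls. A short sketch, assuming the dplyr package and a made-up data frame `survey`:

```r
library(dplyr)

# Hypothetical data with one missing height
survey <- data.frame(name = c("a", "b", "c"), height = c(170, NA, 182))

# With the pipe: assign the result to keep it
survey_clean <- survey %>%
  filter(!is.na(height)) %>%
  mutate(height_m = height / 100)

# Without the pipe: the same two steps as nested function calls
survey_clean2 <- mutate(filter(survey, !is.na(height)), height_m = height / 100)

identical(survey_clean, survey_clean2)
```

Assigning back to the same name (`survey <- survey %>% ...`) overwrites the original object instead of creating a second one.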

  • @violaz1141
    @violaz1141 9 months ago

    Why do you use pipe at the end of each command?

  • @michaelegbujua9369
    @michaelegbujua9369 1 year ago

    I want videos on text manipulation

  • @jamesparker7700
    @jamesparker7700 4 years ago

    Hi Greg, love your videos! I'm a medical student who is going to intercalate next year in public health, which I'm very excited about. I've got a choice, however, between MSc International Public Health (with a focused stream on humanitarian studies) or MSc Humanitarian Studies. I'm interested in working in the humanitarian relief space, but I'm wondering if I should keep my studies a bit broader at the moment and study the MPH. Would be interested to know what you think in terms of whether one would be more advantageous for my career. Thanks, James

  • @tsehayenegash8394
    @tsehayenegash8394 2 years ago

    If you know it, please upload a video on MATLAB code for Multivariate Imputation by Chained Equations (MICE)

  • @AndrzejFLena
    @AndrzejFLena 3 years ago

    Great introductory video! Thanks! :D
    I have a question for everyone: I'm imputing missing values for Gender in a dataframe. Out of the complete rows (no NAs) Male=61.89% and Female=the rest obviously. Is there a way I can impute the values randomly but in these proportions? It feels like there must be but I am new to R... Thanks!!

    • @brazzledazzle-o9w
      @brazzledazzle-o9w 2 years ago

      I'm a bit late, but I guess you could put a condition on a random number generator, so that a draw between 0 and 0.3811 is Female and anything above that is Male.
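
The idea in this reply can be written with base R's `sample()`, which draws values with given probabilities. A sketch under assumptions: the made-up vector `gender` stands in for the real column, and the probabilities are computed from the observed (non-missing) values rather than hard-coded. This ignores any relationship between missingness and other variables.

```r
set.seed(42)  # only to make the sketch reproducible

# Hypothetical column with some missing entries
gender <- c("Male", "Female", "Male", NA, "Male", NA, "Female", "Male", NA, "Male")

# Proportions among the observed values
observed <- gender[!is.na(gender)]
props <- prop.table(table(observed))

# Draw one value per missing entry, using the observed proportions as weights
missing <- is.na(gender)
gender_imputed <- gender
gender_imputed[missing] <- sample(names(props), size = sum(missing),
                                  replace = TRUE, prob = as.numeric(props))

table(gender_imputed)
```

With only a few missing entries the imputed proportions will not match the target exactly; the match improves as the number of draws grows.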

  • @CanDoSo_org
    @CanDoSo_org 2 years ago

    na_if() is just what I was looking for.

    • @gregmartin
      @gregmartin 2 years ago +1

      Thank you for the feedback, Reddy. Hope all is well

  • @yininggao9990
    @yininggao9990 4 years ago

    Why does my latest R version say there is no tidyverse package? 😫

  • @navicto
    @navicto 3 years ago

    This video has useful information. However, it didn't help me understand missing data. It helped me understand how to filter out or replace missing values with a constant. Not the same.

  • @miscelleneoustubes
    @miscelleneoustubes 2 years ago

    Sound system is very poor

  • @edisonwang1765
    @edisonwang1765 2 years ago

    wa la

  • @CheeseCakes11944
    @CheeseCakes11944 1 year ago

    What! "Don't do drugs"? YouTube is one of the most addictive drugs in the world.