How to Identify and Treat Outliers in Stata | Stata Tutorial

  • Published 1 Feb 2025

COMMENTS • 57

  • @Mat-mt8pk 4 years ago +20

    Methods of finding outliers
    1:14 #1. Sorting
    2:52 #2. Box Plot
    6:04 #3. Extremes
    10:05 #4. Histogram
    10:50 #5. Spike Plot
    11:42 #6. Z-score
    Treatment
    13:07 #1. Keep outliers
    13:42 #2. Correct error
    14:23 #3. Winsorization
    19:06 #4. Trimming

  • @gonout8402 2 years ago +3

    In just 20 minutes you have explained everything that my professor taught me in 2 months, and it is much more understandable and useful. Thank you very much.

  • @rouniktalukdar872 2 years ago +1

    Among the nicest video lectures that I have come across on this topic. Thanks a lot. Please keep uploading more content on Stata.

  • @wilsonahinful5127 2 years ago +1

    This is all that I have been looking for, thanks very much indeed

  • @addisugetahun1441 3 years ago

    Thank you for your nice and clear lecture in identifying and treating outliers.

  • @jibrilyero2263 11 months ago +1

    Great job 🎉

  • @alphadie2012 3 years ago

    Clear and concise explanation. Thank you

  • @tomaxow 3 years ago +1

    Really well done and explained

  • @danishjunaid1659 3 years ago +1

    Very well explained

  • @jemalhassen2841 6 months ago

    It is a very helpful video. Thank you!

  • @yilebesaddisu5314 3 years ago

    Thank you dear, very helpful!!

  • @shafiqullahyousafzai15 3 years ago

    Thanks from Afghanistan

  • @korneliuslanggason5477 3 years ago

    thank you for the explanation.

  • @isaacasante4060 2 years ago +3

    Awesome video. Could you please do a similar one using panel data?

    • @thedatahall 2 years ago

      Sure, I will make a video on that.

  • @aibannongspung1765 2 years ago

    Thank you so much for this insightful video! Suppose I want to trim the top and bottom 0.1% of the distribution. How do I write the command?

    • @thedatahall 2 years ago

      I have never tried it with decimals, but the command should look like this: winsor2 variablename, trim cuts(0.1 99.9)

    • @thedatahall 2 years ago

      Let me know if it works.
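
For reference, the 0.1% / 99.9% trim asked about above can also be sketched with built-in commands only. This is an illustration under assumptions, not how winsor2 is implemented: it uses Stata's bundled auto dataset and its variable price as stand-ins, and the names cut_lo, cut_hi, and price_tr are made up for the example.

```stata
* Assumption: the bundled auto dataset stands in for your own data
sysuse auto, clear

* _pctile accepts non-integer percentiles and returns them in r(r1), r(r2)
_pctile price, percentiles(0.1 99.9)
scalar cut_lo = r(r1)
scalar cut_hi = r(r2)

* Trimming: work on a copy and set values outside the cutoffs to missing
generate double price_tr = price
replace price_tr = . if (price < scalar(cut_lo) | price > scalar(cut_hi)) & !missing(price)

* Number of observations that were trimmed
count if missing(price_tr) & !missing(price)
```

In a small sample such as this one, the 0.1 and 99.9 percentiles may coincide with the minimum and maximum, in which case nothing is trimmed; cutoffs this extreme only bite in large datasets.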

  • @lottet1945 3 years ago +1

    Thank you for this clear explanation!
    Do you have a video on Cook's distance and Mahalanobis distance in Stata by any chance?

    • @thedatahall 3 years ago +1

      Thanks for watching the video. Unfortunately, I currently don't have a video on this. I will see whether I can add one in the future. If you are interested in SPSS, though, there are videos on UA-cam.

  • @RafiaAli-n8e 1 year ago

    Hi, hope you are doing great. Can you share the link to the video on multivariate outliers? I am not able to find it.

    • @thedatahall 1 year ago +1

      Thanks for your kind words. Unfortunately, we haven't made a video on multivariate outliers yet. I will add that to my to-do list.

    • @RafiaAli-n8e 1 year ago

      @thedatahall It would be highly appreciated.

  • @AhaNYS 3 years ago

    Thank you for the video! I have a question: I want to use extremes (from SSC) within subcategories. How can I apply extremes to every subcategory?

    • @thedatahall 3 years ago +1

      You can try bys category: extremes, followed by the rest of the command as usual.

  • @shrinjoy1234 3 years ago

    How do we use the winsor command if we want to replace outliers with Q3 + 1.5 IQR?
    Can we use the winsor command to handle outliers in multiple columns in one go? Please advise.

    • @thedatahall 3 years ago

      It is not possible using the winsor or winsor2 command; you will have to write code for it. One way is to create a variable that stores the value of Q3 + 1.5 IQR and then use it to replace the outlying values in your main variable.
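
The approach described in this reply could be sketched as follows. It is a minimal illustration, not the author's code: the bundled auto dataset and the variables price, mpg, and weight are assumptions standing in for your own data, and the _capped suffix is made up for the example. The loop handles several variables in one go, as asked in the question.

```stata
* Assumption: the bundled auto dataset stands in for your own data
sysuse auto, clear

foreach v of varlist price mpg weight {
    * summarize, detail leaves the quartiles in r(p25) and r(p75)
    quietly summarize `v', detail
    local upper = r(p75) + 1.5 * (r(p75) - r(p25))

    * Work on a copy so the original variable stays untouched
    generate double `v'_capped = `v'

    * Replace values above the upper fence with the fence itself
    replace `v'_capped = `upper' if `v' > `upper' & !missing(`v')
}
```

The !missing() guard matters because Stata treats missing values as larger than any number, so without it missing observations would be overwritten with the fence value. A lower fence (Q1 - 1.5 IQR) can be handled the same way.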

  • @atiyaabdulkarim716 3 years ago

    A quick question: if we use the sort function, will it realign the observations in the other variables too? For example, suppose we sort by price but also have variables for age, education, and ID number.
    After sorting by price, would age and education still line up with the right ID, or would only the one variable be sorted and not the others? That could create problems, no?

    • @thedatahall 3 years ago

      In Stata, the sort command keeps track of all variables and sorts them together. The whole row moves, not just the price column.

    • @thedatahall 3 years ago

      sort only sorts in ascending order. There is another command, gsort: gsort -price sorts in descending order.
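
A short sketch of the two commands mentioned in these replies, assuming Stata's bundled auto dataset as a stand-in (the thread's own data is not available). Listing several variables after each sort shows that whole rows move together.

```stata
* Assumption: the bundled auto dataset stands in for your own data
sysuse auto, clear

* Ascending sort: every variable in a row moves with its price
sort price
list make price mpg in 1/3

* Descending sort: gsort with a minus sign in front of the variable
gsort -price
list make price mpg in 1/3
```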

  • @badiahahmed2085 4 years ago

    Thank you for your great video. I have a question, please: after winsorization, can I take the logarithm of some variables? Thank you.

    • @thedatahall 4 years ago +1

      Yes, you can take the log after winsorization. But be advised that after taking the log, the coefficient is interpreted in terms of percent change. I am soon going to make a video on functional forms, so if you are not familiar with interpretation after taking logs, that video will help.

    • @badiahahmed2085 4 years ago

      @thedatahall Thank you for your response, that will be great. MANY THANKS

  • @tranglephuong1896 1 year ago

    Can you share the dataset you used in the video?

    • @thedatahall 1 year ago

      Unfortunately, I have misplaced the data and do-file for this specific video.

  • @atiyaabdulkarim716 3 years ago

    Can you take us through the calculator functions in Stata (the syntax for exponents and complex functions)?

    • @thedatahall 3 years ago

      Sure. Do you want me to make a video on arithmetic and similar functions in Stata?

    • @atiyaabdulkarim716 3 years ago

      @thedatahall Thank you for getting back to me. I am a medical student and I have to use the calculate function in Stata to generate a new variable. My problem is that some components are used in exponent form: if you look at the MDRD equation used to define chronic kidney disease, or the CKD-EPI equation, you will see that serum creatinine level and age are entered in the formula. My specific question is how to do this using variables in my dataset. I tried the exponent function, but my calculations appear to be incorrect and it seems I am not following the right steps. I would highly appreciate it if you could make a video or give me some feedback.

    • @thedatahall 3 years ago

      What command did you use? If you used the exp() function, that one inverts the log. If you email me the equation at info@thedatahall.com, perhaps with some sample data or the command you used, I will look into it. If you want to raise a number to a power, e.g. square it, use gen newvariable = oldvariable^2

    • @thedatahall 3 years ago

      I searched for the MDRD equation, but I am not sure I found the right one.

    • @atiyaabdulkarim716 3 years ago

      @thedatahall Thank you for getting back to me. Here is the link: patient.info/doctor/estimated-glomerular-filtration-rate-gfr-calculator
      Normal creatinine values range from 0.6 to 1.2 mg/dL, so one can use values at the higher end, or perhaps an older age, and see what the filtration rate is.
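
To illustrate the ^ operator discussed in this exchange, here is a hedged sketch of a power-based formula built from dataset variables. Everything in it is an assumption for illustration: the three sample rows are invented, the variable names scr, age, female, and egfr are hypothetical, and the coefficients (175, -1.154, -0.203, 0.742) are the commonly cited four-variable MDRD values, which should be checked against the linked calculator before any real use. The race multiplier that appears in some versions of the equation is left out.

```stata
* Invented sample data: serum creatinine (mg/dL), age (years), female (0/1)
clear
input scr age female
0.8 40 0
1.2 70 1
2.0 65 0
end

* ^ raises a variable to a power; exp(x) would instead compute e^x
generate double egfr = 175 * scr^(-1.154) * age^(-0.203)

* Apply the sex multiplier only where it is relevant
replace egfr = egfr * 0.742 if female == 1

list scr age female egfr
```

Wrapping negative exponents in parentheses, as in scr^(-1.154), keeps the expression unambiguous. Using exp() here would give wrong results, because exp(x) computes e raised to x rather than x raised to a chosen power.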

  • @alfinasintiya7477 3 years ago

    I cannot use "extremes". Is there a solution?

    • @thedatahall 3 years ago

      I just used "extremes" and it works fine for me. What error are you getting?