Methods of finding outliers
1:14 #1. Sorting
2:52 #2. Box Plot
6:04 #3. Extremes
10:05 #4. Histogram
10:50 #5. Spike Plot
11:42 #6. Zscore
Treatment
13:07 #1. Keep outliers
13:42 #2. Correct error
14:23 #3. Winsorization
19:06 #4. Trimming
Thanks for the efforts
You have explained everything that my professor taught me in 2 months in just 20 minutes, and it's much more understandable and useful. Thank you very much.
😄
Among the nicest video lectures that I have come across on this topic. Thanks a lot. Please keep uploading more content on Stata.
Thanks for the appreciation
This is all that I have been looking for, thanks very much indeed
Thank you for your nice and clear lecture in identifying and treating outliers.
Great job 🎉
Clear and concise explanation. Thank you
Really well done and explained
Thanks
Very well explained
Thanks
It's a very helpful video. Thank you!
Thanks. Keep sharing
Thank you dear, very helpful!!
Thanks from Afghanistan
thank you for the explanation.
Thanks
Awesome video. Could you please do a similar one using panel data?
Sure, I will make a video on that.
Thank you so much for this insightful video! Suppose I want to trim the top and bottom 0.1% of the distribution. How do I write the command?
I have never tried it with decimals, but the command would look like: winsor2 variablename, trim cut(0.1 99.9)
Let me know if it works
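For readers who want to check the result without winsor2 (a community-contributed command from SSC), the same trimming can be sketched with Stata's built-in _pctile command. This is only a sketch under assumptions: the bundled auto dataset and its price variable stand in for your own data, and price_tr is a hypothetical variable name.

```stata
* Sketch: trim the top and bottom 0.1% using built-in commands only
sysuse auto, clear

* _pctile stores the requested percentiles in r(r1) and r(r2)
_pctile price, percentiles(0.1 99.9)
local lo = r(r1)
local hi = r(r2)

* Values outside the cutoffs are left missing in the new variable
generate double price_tr = price if price >= `lo' & price <= `hi'

* Compare the original and trimmed variables
summarize price price_tr
```

With only 74 observations in the auto dataset, such narrow cutoffs should sit at or very near the minimum and maximum, so expect little or no trimming here. The effect only becomes visible on larger samples.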
Thank you for this clear explanation!
Do you have a video on Cook's distance and Mahalanobis distance in Stata by any chance?
Thanks for watching the video. Unfortunately, I currently don't have a video on this. I will see if I can add one in the future. But if you are interested in SPSS, there are videos on YouTube.
Hi, hope you are doing great. Can you share the link to the video on multivariate outliers? I am not able to find it.
Thanks for your kind words. Unfortunately, we haven't made any video on multivariate outliers. I will add that to my to-do list.
@thedatahall It would be highly appreciated.
Thank you for the video! I have a question: I want to use extremes (from SSC) within subcategories. How can I apply it to every subcategory?
You can try prefixing it with bys, e.g. bys category: extremes followed by your usual variables and options.
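A minimal sketch of the by-group approach suggested in the reply. It assumes that extremes accepts the by prefix, as the reply implies, and it uses the bundled auto dataset, with foreign standing in for your category variable.

```stata
* Sketch: list extreme values separately within each subgroup
* extremes is community-contributed; install it once with: ssc install extremes
sysuse auto, clear

* bysort sorts by the grouping variable, then runs the command within each group
bysort foreign: extremes price
```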
How do we use the winsor command if we want to replace outliers with Q3 + 1.5 × IQR?
Can we use the winsor command to handle outliers in multiple columns in one go? Please advise.
Replacing with Q3 + 1.5 × IQR is not possible with the winsor or winsor2 commands, because they cut at percentiles; you will have to write code for it. One way is to create a variable that stores the value of Q3 + 1.5 × IQR and then use it to replace the outlying values in your main variable. For ordinary percentile cuts, winsor2 does accept several variables at once.
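The approach described in the reply can be sketched as a loop over several variables. This is a sketch under assumptions, not the author's own code: the auto dataset and the variables price, mpg, and weight stand in for your data, and the _cap suffix is a hypothetical naming choice. A local macro holds the cutoff instead of a stored variable.

```stata
* Sketch: cap values above Q3 + 1.5*IQR for several variables in one go
sysuse auto, clear

foreach v of varlist price mpg weight {
    * summarize, detail returns the quartiles in r(p25) and r(p75)
    quietly summarize `v', detail
    local q1 = r(p25)
    local q3 = r(p75)
    local upper = `q3' + 1.5 * (`q3' - `q1')

    * Work on a copy so the original variable is preserved
    generate double `v'_cap = `v'

    * Missing values count as very large in Stata, so exclude them explicitly
    replace `v'_cap = `upper' if `v' > `upper' & !missing(`v')
}

summarize price price_cap mpg mpg_cap weight weight_cap
```

The lower fence, Q1 - 1.5 × IQR, can be handled the same way with a second replace that uses a less-than condition.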
A quick question: if we use the sort command, will it align the observations in all the other variables? For example, say we sort by price but also have variables for age, education, and ID number.
After sorting by price, would age and education stay matched to the right ID, or would only the one variable be sorted and not the others? That could create problems, no?
In Stata, the sort command keeps track of all variables and sorts them together. The whole row moves, not just the price column.
sort only sorts in ascending order. For descending order there is another command, gsort: gsort -price sorts price from highest to lowest.
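A short sketch showing that whole rows move together, assuming the bundled auto dataset stands in for your data. The make and mpg values stay attached to their price after each sort.

```stata
* Sketch: sorting reorders whole observations, not a single column
sysuse auto, clear

* Ascending: the cheapest cars come first, with make and mpg still matching
sort price
list make price mpg in 1/5

* Descending: gsort with a minus sign puts the most expensive cars first
gsort -price
list make price mpg in 1/5
```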
Thank you for your great video. I have a question, please: after winsorization, can I take the logarithm of some variables? Thank you.
Yes, you can take the log after winsorization. But be advised that after taking the log, the interpretation of the coefficients changes to (approximate) percentage changes. I am soon going to make a video on functional forms, so if you are not familiar with interpreting coefficients after taking logs, that video will help.
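A minimal sketch of winsorizing first and then taking the log. The assumptions: the auto dataset stands in for your data, the 1st and 99th percentile cutoffs are only an example, and ln_price_w is a hypothetical variable name.

```stata
* Sketch: winsorize, then log-transform, then regress
* winsor2 is community-contributed; install it once with: ssc install winsor2
sysuse auto, clear

* Winsorize price at the 1st and 99th percentiles into a new variable price_w
winsor2 price, cuts(1 99) suffix(_w)

* ln() returns missing for zero or negative values, so check the range first
generate double ln_price_w = ln(price_w)

* With a logged outcome, 100 times the mpg coefficient approximates the
* percentage change in price for a one-unit increase in mpg
regress ln_price_w mpg
```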
@thedatahall Thank you for your response, that will be great. Many thanks.
Can you share the dataset you used in the video?
Unfortunately, I have misplaced the data and do-file for this specific video.
Can you take us through the calculator functions in Stata (the syntax for exponents and complex functions)?
Sure. Do you want me to make a video on arithmetic and similar functions in Stata?
@thedatahall Thank you for getting back to me. I am a medical student, and I have to use the calculation functions in Stata to generate a new variable. My problem is that some components appear as exponents. If you look at the MDRD equation used to define chronic kidney disease, or the CKD-EPI equation, you will see that serum creatinine level and age enter the formula. My specific question is how to use this information from variables in my dataset. I tried the exponent function, but my calculations appear to be incorrect, and it seems I am not following the right steps. I would highly appreciate it if you could make a video or give me some feedback.
Which command did you use? If you used the exp() function, that computes e to the power of its argument, which inverts the natural log; it does not raise a variable to a power. If you email me the equation at info@thedatahall.com, perhaps with some sample data or the command you used, I will look into it. If you want to raise a variable to a power, e.g. square it, use gen newvariable=oldvariable^2
I searched for the MDRD equation, but I am not sure I found the right one.
@thedatahall Thank you for getting back to me. Here is the link: patient.info/doctor/estimated-glomerular-filtration-rate-gfr-calculator
Normal creatinine values range from 0.6 to 1.2 mg/dL, so one can use values at the higher end, or perhaps an older age, and see what the filtration rate is.
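The exchange above comes down to using the power operator ^ rather than exp(). A minimal sketch follows, assuming the commonly cited 4-variable MDRD form, eGFR = 175 × Scr^(-1.154) × age^(-0.203) × 0.742 if female. Verify the coefficients against your own source before using them, and note that the race coefficient is left out here. The variable names scr, age, female, and egfr are hypothetical, and the two data rows are made-up illustrative values.

```stata
* Sketch: raising variables to (negative, non-integer) powers with ^
clear
input scr age female
0.8 30 0
1.2 70 1
end

* scr = serum creatinine in mg/dL, age in years
* ^ is the power operator; exp(x) would compute e^x instead
generate double egfr = 175 * scr^(-1.154) * age^(-0.203)

* Apply the sex adjustment only to women (female == 1)
replace egfr = egfr * 0.742 if female == 1

list scr age female egfr
```

The parentheses around the negative exponents keep the intent explicit. A zero or negative base with a non-integer exponent would return missing, which is not an issue for creatinine or age.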
I am not able to use "extremes". Is there a solution?
I just used extremes and it is working fine for me. What error are you getting?