Stata - Keep/Drop and Missing values

  • Published 3 Dec 2024

COMMENTS • 41

  • @royyanramdhanidjayusman7711
    @royyanramdhanidjayusman7711 3 years ago +2

    Many thanks, Steffen, it is absolutely a clear explanation. 🙂

  • @francomolina457
    @francomolina457 3 years ago +1

    Thank you so much, Steffen!

    • @SteffensClassroom
      @SteffensClassroom 3 years ago

      Happy you liked it! Good luck with your Stata journey!

  • @qianwang3228
    @qianwang3228 4 years ago +2

    Thank you, I searched for a whole day for how to drop under a condition!

    • @SteffensClassroom
      @SteffensClassroom 4 years ago

      Glad I could help!
      If you are missing anything else, don't hesitate to ask!

    • @qianwang3228
      @qianwang3228 4 years ago +1

      @SteffensClassroom Thank you! I have a further question: if I use "drop if" to drop some observations, but then want to use these observations in a later regression, what should I do?

    • @SteffensClassroom
      @SteffensClassroom 4 years ago +1

      Thank you for the question!
      The way to do this is to use preserve and restore.
      Any change you make to the data after you write preserve is undone when you write restore: the data go back to what you had when you wrote preserve. Sounds strange? Let me give an example:
      preserve
      drop if obs==something
      reg y x
      restore
      reg y x
      The regression inside the preserve/restore is run without the observations you dropped, and the second regression is run on the sample where nothing was dropped, i.e. the sample you had before you typed preserve.
      Hope this helps!
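
      A minimal sketch of the preserve/restore pattern described above, assuming Stata's built-in auto dataset stands in for your own data. The variables price, mpg and foreign and the drop condition are illustrative only, and the block is meant to be run as a whole from a do-file.

      ```stata
      * Illustrative only: auto.dta stands in for your own data
      sysuse auto, clear

      preserve                      // take a snapshot of the data in memory
          drop if foreign == 1      // comparison needs a double equals sign
          regress price mpg         // estimated on the reduced sample
      restore                       // bring back the snapshot taken at preserve

      regress price mpg             // estimated on the full sample again
      ```

      If the only goal is to estimate on a subsample, an if qualifier such as regress price mpg if foreign != 1 gives the same estimates as the first regression without dropping anything.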

    • @qianwang3228
      @qianwang3228 4 years ago +1

      @SteffensClassroom Thank you so much Steffen, it works. Really appreciate your help!

    • @SteffensClassroom
      @SteffensClassroom 4 years ago

      Happy to help!
      Good luck :)
      Please share the videos. I hope they will help as many people as possible!

  • @syedmaroofali6829
    @syedmaroofali6829 4 years ago +2

    Excellent videos! Loving them so far! :) I was wondering: if I drop a variable, can I also undo it?

    • @SteffensClassroom
      @SteffensClassroom 4 years ago +1

      No, not really.
      Unless you surrounded that part of the code with preserve/restore (see help preserve).
      However, you can just write this up in your do-file. If you figure out that you did not need to drop a certain variable, you can just adapt your do-file, re-run it, and you are back!
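
      A short sketch of the preserve/restore route for a dropped variable, assuming the built-in auto dataset; rep78 is just an example variable, and the block should be run as a whole from a do-file.

      ```stata
      sysuse auto, clear

      preserve
          drop rep78                       // the variable is removed from memory
          capture confirm variable rep78
          display _rc                      // nonzero return code: rep78 is not found
      restore

      confirm variable rep78               // no error: rep78 is back after restore
      ```

      This only works when preserve was issued before the drop; otherwise re-running the do-file from the original data is the way back.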

    • @syedmaroofali6829
      @syedmaroofali6829 4 years ago

      @SteffensClassroom Thank you for the response!

  • @lauragualdron3266
    @lauragualdron3266 3 years ago +1

    Hi Steffen, thank you so much for your help! Is there a way I could drop all missing values from my dataset?

    • @SteffensClassroom
      @SteffensClassroom 3 years ago

      First off, I am not sure you really want to do that. It is good to know that Stata automatically excludes observations with a missing value in any variable included in your estimation, so you don't have to do it for that reason. It is better to present all your data.
      However, if you really want to do it and you don't have that many variables, you could do this:
      keep if !missing(var1) & !missing(var2) & !missing(var3)
      Or you can install the dropmiss command and write:
      dropmiss, obs any
      This is better if you have a larger dataset with many variables.
      I hope it helps!
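
      A hedged sketch of the two routes mentioned above, assuming the built-in auto dataset (where rep78 has some missing values) in place of your own data. It relies on the fact that missing() accepts several arguments and is true if any of them is missing, which is a shorter way to write the chained keep condition.

      ```stata
      sysuse auto, clear

      * How many observations have a missing value in any of the listed variables?
      count if missing(price, mpg, rep78)

      * Route 1: let the estimation command handle it (nothing is dropped from the data)
      regress price mpg rep78

      * Route 2: drop those observations explicitly
      drop if missing(price, mpg, rep78)
      ```

      Comparing the number of observations reported by regress with the result of count shows which observations were left out of the estimation.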

  • @almaisaks
    @almaisaks 2 years ago +1

    Thank you

  • @anonymousduckling3820
    @anonymousduckling3820 3 years ago +1

    Hi Steffen! Thank you so much for your videos, they are extremely helpful. I was wondering if you could answer a quick question. I am using data from the World Bank Development Indicators and have no 'blank' values or '.' values, only zeros, which I believe to be indicative of a missing value. As such, would the command to drop the variable be 'drop if Inflation==0'?

    • @SteffensClassroom
      @SteffensClassroom 3 years ago

      Hi! Thank you for your question.
      Indeed, the command you suggest would work if the variable is numeric rather than a string. If it is a string, you would have to put " " around the 0, so that the command would be: drop if Inflation=="0"
      Likewise, if it is blank or ".", then you can use drop if Inflation=="" and drop if Inflation=="." respectively.
      Let me know if this helps!
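
      A small sketch of the numeric versus string distinction, using a made-up toy dataset; the variable names inflation and inflation_str and all values are hypothetical.

      ```stata
      clear
      input float inflation str4 inflation_str
      2.5 "2.5"
      0   "0"
      1.1 "1.1"
      .   ""
      end

      describe inflation inflation_str    // the storage type shows numeric versus string

      drop if inflation == 0              // numeric variable: no quotes around the value
      drop if inflation_str == ""         // string variable: the empty string counts as missing
      ```

      For a numeric variable the missing value is written without quotes, as in drop if inflation == . or drop if missing(inflation).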

  • @TheEkhators
    @TheEkhators 1 year ago

    Hi Steffen, thanks a lot for your video. How do I exclude a particular observation while running a regression in Stata? Say I want to regress wage on age, gender and experience but I want to use only data for those below a certain age, how do I go about it?

  • @FemkeHuisman
    @FemkeHuisman 3 years ago +1

    So, if stata already automatically drops observations with missing values, should you not worry about them?

    • @SteffensClassroom
      @SteffensClassroom 3 years ago +1

      There could be many reasons, some of which are highlighted here:
      shorturl.at/psAM6
      It is also important to think about why there are missing values, as there could be many reasons for this, especially if you have a panel. I discuss this a bit here (early in the lecture):
      I hope this helps!

  • @adinacska
    @adinacska 1 year ago +1

    Can you create a new variable to contain the values dropped and kept?

    • @SteffensClassroom
      @SteffensClassroom 1 year ago

      Hi!
      Short answer: yes. You drop via a condition, so you can simply create a variable that equals that condition. See the gen video :)
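
      A minimal sketch of turning a drop condition into an indicator variable, assuming the built-in auto dataset; the name would_drop and the condition missing(rep78) are illustrative.

      ```stata
      sysuse auto, clear

      * 1 if the observation would be dropped by "drop if missing(rep78)", 0 otherwise
      generate byte would_drop = missing(rep78)

      tabulate would_drop                      // how many observations are flagged
      regress price mpg if would_drop == 0     // use the flag instead of dropping
      ```

      Keeping the flag rather than dropping leaves the full data in memory, so both groups remain available for later comparisons.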

    • @adinacska
      @adinacska 1 year ago

      @SteffensClassroom thank you so much for the reply! I managed to figure it out! 👍👍👍

  • @ADashOfColour1
    @ADashOfColour1 3 years ago +1

    Is there a way to conditionally drop variables when missing across a whole data set? Or do I have to do it variable by variable?

    • @SteffensClassroom
      @SteffensClassroom 3 years ago

      Not sure what you mean. Do you want to drop a variable if it is essentially empty, that is, if it contains not a single non-missing value?

    • @SteffensClassroom
      @SteffensClassroom 3 years ago

      But you can drop variables that are completely empty with: missings dropvars, force
      You may need to install the missings command first: ssc install missings
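
      A brief sketch of the missings workflow, assuming internet access for the install and the built-in auto dataset; empty_var is a hypothetical variable created only so that there is something to drop.

      ```stata
      * One-time install of the community-contributed command (needs internet access)
      ssc install missings

      sysuse auto, clear
      generate empty_var = .              // hypothetical variable with no data at all

      missings report                     // count missing values per variable
      missings dropvars, force            // drop variables that are missing in every observation
      ```

      Here the force option is needed because the data in memory have changed since they were loaded.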

    • @ADashOfColour1
      @ADashOfColour1 3 years ago +1

      @SteffensClassroom Hi, thank you for replying! I have coded a bunch of values as missing across the data set (non-answered questions on surveys) and want to find a way to drop these missing points in one go. Currently I am using drop if var==. for each variable, but was wondering if there is a more efficient way to do this? Thank you for your help!

    • @SteffensClassroom
      @SteffensClassroom 3 years ago +1

      Not gonna lie. I don't think it is a great idea to drop all your missing observations. If you want to do it, then check here: www.stata.com/statalist/archive/2009-12/msg00524.html
      Good luck! :)

  • @Abrar_Ahmed05
    @Abrar_Ahmed05 2 years ago +1

    But how do I keep more than one observation in the data?

    • @SteffensClassroom
      @SteffensClassroom 2 years ago

      Hello!
      You can list several variables after keep/drop, or combine several conditions with & (and) or | (or) in the if part of your command.
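
      A short sketch of keeping several variables and combining conditions, assuming the built-in auto dataset; the variables and cut-offs are illustrative only.

      ```stata
      sysuse auto, clear

      keep make price mpg foreign rep78          // several variables: just list them
      keep if foreign == 1 & price < 10000       // both conditions must hold
      keep if rep78 == 4 | rep78 == 5            // either condition may hold
      ```

      The last line can also be written as keep if inlist(rep78, 4, 5), which is easier to read when there are many values.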

  • @sugandh9498
    @sugandh9498 1 year ago

    Hi Prof. My oil price data has missing values and I am trying to test it for structural breaks. But I am getting an error msg 'gaps not allowed' repeatedly. Is it due to missing values?

    • @SteffensClassroom
      @SteffensClassroom 1 year ago +1

      Hi!
      Indeed, when testing for structural breaks, you should have no missing values. Having missing values for oil prices seems strange, so you should be able to fill them in. Otherwise, you would have to change to a different data frequency.
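
      A hedged sketch of how one might inspect and fill gaps before such a test, using a made-up toy series; the variable names t and price and all values are hypothetical. Linear interpolation is only one possible way to fill in values, and whether it is appropriate for your data is a judgment call.

      ```stata
      clear
      input t price
      1 50
      2 52
      4 55
      5 .
      6 58
      end

      tsset t
      tsreport                                  // reports gaps in the time variable (t = 3 is absent)
      tsfill                                    // adds the absent period as a new observation
      ipolate price t, generate(price_filled)   // linear interpolation of the missing prices
      ```

      Note the two different problems in the toy data: a period that is absent from the data altogether (t = 3) and a period that is present but has a missing price (t = 5).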