Stata - How to Estimate a Heckman Selection Model

  • Published 7 Aug 2024
  • Welcome to my classroom!
    This video is part of my Stata series, in which I help you learn how to use Stata. In this video, we take a look at how to estimate a Heckman selection model.
    Note: What I show here is my take on the topic. I would be happy to receive comments!
    Useful links:
    ►Twitch: / steffens_classroom
    ►Twitter: / steff5001
    ►Workpage: www.rug.nl/staff/s.eriksen/
    ►Subscribe: cutt.ly/Qfu9cmV

COMMENTS • 38

  • @ancestralfire
    @ancestralfire 22 days ago +1

    I'm a PhD candidate in Spain and was recommended this correction for one of my papers. Thank you for explaining it in a way that is simple to understand.

    • @SteffensClassroom
      @SteffensClassroom  22 days ago

      Happy that you found it useful. Good luck with your paper!

  • @yannishen6637
    @yannishen6637 1 month ago

    Thank you very much! It's really useful!!

  • @danielaivanova3449
    @danielaivanova3449 2 months ago

    Best explanation so far. Thank you.

  • @jurajsimcisko6416
    @jurajsimcisko6416 9 months ago +1

    Thanks for the video.

  • @user-my2ri6im5g
    @user-my2ri6im5g 4 months ago +1

    This has been extremely useful, thank you very much! The outcome model I'm running is a multinomial logistic regression. In that case, are all the steps the same except the last one, where mreg has to be specified instead of reg? I would really appreciate any help.

    • @SteffensClassroom
      @SteffensClassroom  4 months ago

      I would not be certain as I have not done it myself before. However, it sounds reasonable at first glance.

  • @user-fg3os1px5n
    @user-fg3os1px5n 5 months ago +1

    Thank you for explaining this in detail! One question: with panel data, if I'm understanding correctly, the IMR will be computed separately for each time period for the same individual. Do we then treat the IMR as a state variable, or is it a control variable for some unobservables? Thanks a lot!

    • @SteffensClassroom
      @SteffensClassroom  5 months ago +1

      Thank you for your question. I must admit that I have not studied the panel data case enough to give a good answer to your question. However, my intuition would tell me that IMR would be treated the same. That is, once generated after the first step, it is added as an independent variable in the second step.
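
The step described in this reply (generate the IMR after the first stage, then add it as a regressor in the second) can be sketched outside Stata as follows. This is a minimal sketch, assuming the first stage is a probit and that its linear predictions are already available; the function names and the input values are hypothetical and only illustrate the formula IMR = phi(xb) / Phi(xb).

```python
import math


def normal_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(xb: float) -> float:
    """IMR for a selected observation: phi(xb) / Phi(xb)."""
    return normal_pdf(xb) / normal_cdf(xb)


# Hypothetical first-stage probit linear predictions, one per row.
# With panel data, each person-period row would get its own value.
linear_predictions = [-1.0, 0.0, 1.5]
imr = [inverse_mills_ratio(xb) for xb in linear_predictions]
```

Each value in `imr` would then enter the second-stage regression as one additional independent variable, estimated on the selected observations only.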

  • @highfisch6590
    @highfisch6590 2 months ago +1

    Thanks Steffen, this was already super helpful! You mention at 9:08 that the exclusion restriction variables should not be strongly correlated with the IMR. However, I am working with a paper that says there should not be a significant correlation between the exclusion restriction variable and the dependent variable of the second stage (the paper is Brauer, Wiersema and Binder 2023).
    In my case, I find that my potential exclusion restriction variable is significant in the probit model (so it would be a potential candidate), but I also find that it has a low but significant correlation with the DV of the second stage (-0,1; p-value < 0,001).
    What is your opinion on this criterion? Is it still a viable candidate if the correlation with the IMR is low?

    • @SteffensClassroom
      @SteffensClassroom  2 months ago

      Hi! Thank you for your comment. From what I understand, at 9:08 I do indeed mention that the correlation between the IMR and the exclusion restrictions should not be too high. You mention that there should not be a significant correlation between your exclusion restriction and the dependent variable in the second stage. I am a bit confused here. Your exclusion restrictions are only there in the first stage, and not 'directly' in the second stage.
      Given what you show me here (-0,1; p-value < 0,001), I wouldn't be too worried.

  • @IkennaNnabue
    @IkennaNnabue 1 month ago

    Thank you very much. With this video, my first challenge is settled. My second challenge is multinomial endogenous switching regression. Do you know how to perform it in Stata?

  • @robertneuhaus9381
    @robertneuhaus9381 5 months ago +1

    Hey, thank you so much. That really clarified a lot for me. One question: at 8:59 you talk about a paper recommending a correlation matrix for the IMR and the exclusion restriction variables. Could you provide the citation? It would be really helpful, and thanks again.

    • @SteffensClassroom
      @SteffensClassroom  5 months ago +2

      Here you go:
      Certo, S.T., Busenbark, J.R., Woo, H.S. and Semadeni, M., 2016. Sample selection bias and Heckman models in strategic management research. Strategic Management Journal, 37(13), pp.2639-2657.

    • @robertneuhaus9381
      @robertneuhaus9381 5 months ago +1

      @SteffensClassroom Awesome! Thank you so much for the quick response.

    • @robertneuhaus9381
      @robertneuhaus9381 5 months ago +1

      @SteffensClassroom Hey Steffen. I read the paper and I think you might have made a mistake. The correlation should be tested between the independent variable and the inverse Mills ratio in order to evaluate the quality of the exclusion restrictions. In your video you only check the correlation between the IMR and the exclusion restrictions. Please share your thoughts.

    • @SteffensClassroom
      @SteffensClassroom  5 months ago

      Hi again! I hope you liked the paper. I think it is a really good piece. They talk about the correlation between IMR and x. For example, in the Simulation condition section, they refer to reporting the correlation between IMR and x (as in Bushway et al., 2007; Leung and Yu, 1996). Their x refers to the exclusion restrictions. You can read this in the Sample selection bias section on page 2643.
      But please also tell me on which page of the paper they say this. It is a rather long read :)

    • @robertneuhaus9381
      @robertneuhaus9381 5 months ago

      @SteffensClassroom I thought it was a really interesting paper. Still, I am just a Master's student who often struggles with these complex topics. On page 2649 they say:
      "Nevertheless, some scholars have proposed evaluating the strength of exclusion restrictions by examining the correlation between the inverse Mills ratio and the independent variable, x (Bushway et al., 2007; Leung and Yu, 1996; Moffitt, 1)"
      If they really mean that x is the exclusion restriction, I at least find this sentence oddly phrased and a bit misleading. I would not have guessed that they refer to the ER here.
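
Whichever reading of x is correct, the check discussed in this thread comes down to a correlation between the IMR and one other variable. The sketch below is a plain Pearson correlation under that assumption; the function name and the data are hypothetical, and it takes no position on whether x should be the exclusion restriction or the independent variable of interest.

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equally long numeric sequences."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical values: the IMR from the first stage and a candidate
# variable (exclusion restriction or independent variable) per row.
imr_values = [1.20, 0.95, 0.70, 0.45, 0.30]
candidate = [0.0, 1.0, 0.0, 1.0, 1.0]
r = pearson_r(imr_values, candidate)
```

A correlation close to 1 or -1 in absolute value would signal that the IMR carries little information beyond the candidate variable, which is the concern raised in the video.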

  • @r.a217
    @r.a217 7 months ago +1

    Please, how do you test for the presence of sample selection bias using the lambda in MLE estimation?

    • @SteffensClassroom
      @SteffensClassroom  7 months ago

      Stated crudely: You basically check for the significance of your inverse Mills ratio (lambda). That is it.

    • @r.a217
      @r.a217 7 months ago

      @SteffensClassroom Yes, it is reported in the Heckman two-step model but not in the Heckman MLE. In the latter, you only get the lambda coefficient.

    • @r.a217
      @r.a217 7 months ago

      I did not opt for the two-step procedure because of its strong assumption of homoscedasticity.

    • @SteffensClassroom
      @SteffensClassroom  7 months ago

      But you also get an associated standard error. That should give you everything you need to calculate it yourself. Remember your stats 101 course :)
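
The "stats 101" calculation hinted at here can be sketched as a two-sided z-test: divide the coefficient by its standard error and look up the tail probability of the standard normal. This is a minimal sketch that assumes the estimate is approximately normally distributed; the function name and the numbers are hypothetical, not output from any model.

```python
import math


def z_test(coefficient: float, standard_error: float):
    """Return the z statistic and two-sided p-value for H0: coefficient = 0."""
    z = coefficient / standard_error
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)) for the standard normal.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical lambda estimate and standard error read off the output.
lambda_hat = 0.50
lambda_se = 0.20
z_stat, p_val = z_test(lambda_hat, lambda_se)
```

A small `p_val` would be read as evidence that lambda differs from zero, that is, that sample selection matters in the outcome equation.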

    • @SteffensClassroom
      @SteffensClassroom  7 months ago +1

      Not the reasoning I would go for. Heteroscedasticity can be fixed.

  • @MrAbrahamdelpozo
    @MrAbrahamdelpozo 6 months ago +1

    Hi, what can I do if I have different datasets? One has wages and gender, and the other has all the variables needed to calculate the probability of being an employee. I don't know how to merge them since there is no common id variable.

    • @SteffensClassroom
      @SteffensClassroom  6 months ago

      Hi!
      You would have to create an id variable that links the observations. Otherwise, ... well...
      I suggest checking the merge video :)

    • @MrAbrahamdelpozo
      @MrAbrahamdelpozo 6 months ago

      Yep, but the datasets are not the same: one has only actual employees, wages, etc., and the other one also has non-employees. I use the latter to run the probit and then the former to see the wage differences.

    • @SteffensClassroom
      @SteffensClassroom  6 months ago

      @MrAbrahamdelpozo There should still be a way to merge this. Sounds like a 1:m merge. In any case, it seems like you could link an employee's wage in one dataset to a set of other variables in the other dataset.
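
The linking idea in this reply can be sketched as a keyed merge, assuming an id variable has already been created in both datasets. Everything here is hypothetical (the `person_id` key, the field names, the rows); the point is only that each row of the larger dataset picks up the wage from the employee dataset when a match exists and stays unmatched otherwise.

```python
def merge_one_to_many(one_rows, many_rows, key):
    """Attach fields from one_rows (unique key) to every matching row in many_rows."""
    lookup = {}
    for row in one_rows:
        if row[key] in lookup:
            raise ValueError(f"duplicate key in the 'one' side: {row[key]!r}")
        lookup[row[key]] = row

    merged = []
    for row in many_rows:
        combined = dict(row)
        match = lookup.get(row[key])
        if match is not None:
            combined.update(match)
        # Flag whether the row found a partner, so unmatched rows stay visible.
        combined["matched"] = match is not None
        merged.append(combined)
    return merged


# Hypothetical data: wages exist only for employees.
wages = [{"person_id": 1, "wage": 20.0}, {"person_id": 3, "wage": 31.5}]
people = [
    {"person_id": 1, "age": 30},
    {"person_id": 2, "age": 41},
    {"person_id": 3, "age": 25},
]
merged = merge_one_to_many(wages, people, "person_id")
```

Rows with `matched` set to False correspond to non-employees: they can be used in the selection equation but have no wage for the outcome equation.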

  • @RonakMaheshwari-ps8lo
    @RonakMaheshwari-ps8lo 28 days ago

    Hi! What should I do if my selection equation is a multinomial model?

    • @SteffensClassroom
      @SteffensClassroom  27 days ago

      Not use a Heckman (:

    • @RonakMaheshwari-ps8lo
      @RonakMaheshwari-ps8lo 27 days ago

      @SteffensClassroom Can you suggest any alternatives?

    • @SteffensClassroom
      @SteffensClassroom  27 days ago

      I am not sure what you want to accomplish? You need to think about what the goal is. You could also simply transform your selection variable into a dummy? Again, I do not know what you wish to accomplish here.