An intuitive introduction to Difference-in-Differences

Doug McKee

Published 23 Oct 2024

COMMENTS • 106

@SalehBabazadeh 9 years ago ⁺⁴⁸
Thank you so much, Doug! I just wanted to encourage you to keep up this great job. Your videos are awesome, and I believe they are being used by different people in different fields.
@dougmckee673 9 years ago ⁺⁵
+Saleh Babazadeh Thanks so much for the kind words! I really should post more of these!
@TaroQuispe 4 years ago ⁺¹
@dougmckee673 thanks from my side too, very clear and easy to understand. Do consider posting similar vids on regression techniques and similar, cheers!
@pedrocolangelo2458 3 years ago ⁺¹⁴
This is probably one of the best videos on this subject that I've ever seen. Thanks!!
@anz-t1f 4 years ago ⁺⁵
One of the best and most lucid explanations of the DID method. Thank you for this, Doug. I especially liked how you explain the intuition behind why the DID estimate calculated by hand is the same as the one estimated by the regression model. And the part where you elaborate on the simple benefits of using a regression for a DID model is great.
I really appreciate you sharing your understanding here.
@monicamu8013 4 years ago ⁺²
When I watched the video for the first time, I was totally lost. The second time, I paused along the way to give myself more time to understand your super intelligent and super long sentences. It is so much clearer now. Thank you so much!
@sharonie 3 years ago ⁺¹
Best diff-in-diff course I have taken. Thanks!
@lawrencecobb2107 2 years ago ⁺¹
This is such a clear and helpful video. I’m taking an exam in an hour and doing last-minute double checks. This makes me feel more confident, thank you.
@bl.l1506 4 years ago ⁺¹
Your videos have been vital for understanding the contents of my statistics course! So far, I've supplemented every new concept with your videos. Sometimes, I even watch your video first and then do the readings. Please keep doing these videos!
@brothermalcolm 3 years ago
Absolutely brilliant tutorial, first result returned, wish YouTube was always this helpful!
@xb2856 1 year ago
Way more intuitive than I previously thought, well put, thanks.
@zaraazami4936 9 years ago ⁺⁴
Thank you so much! This video was way more helpful than reading pages and pages on DD! Very clear and to the point! Thank you!!
@dianaadamczyk5273 6 years ago ⁺¹
Can't tell you how useful your videos are. Thanks for passing on the knowledge!
@kevinvandenbrink8214 9 years ago ⁺³
Thanks for the video, it really helped me in my finance research. Just one thing: when you talk about the dummy variable Dtr, I think it takes 1 if the person is in the treatment group and 0 if the person is in the control group.
@dougmckee673 9 years ago ⁺¹
Kevin van den Brink You're exactly right. When (if) I re-record this I'll fix that. Thanks!
@Josefk40 8 years ago ⁺¹
Excellent explanation in 12 minutes. Thank you
@Itachi0567 4 years ago ⁺¹
Thanks a lot for this clear explanation, you don't know how much it helped me.
@techierealestate 6 years ago ⁺¹
Clear and right to the point. I always wondered why the coefficient on the multiplication (interaction) term is the DD coefficient. Now I know :D
@marben7062 8 years ago ⁺¹
Thank you very much Doug.
It helped me to analyse my data (pooled cross section).
@thefadingmoonlight 8 years ago
Thank you so much for uploading this! I had looked online at DID and was confused. This made it so easy to understand and apply.
@Non-disjunction 3 years ago
You are such a legend, Mister McKee
@digray6732 2 years ago
Thank you for this! I didn't quite understand the very last point, i.e. the difference between the cases where DD is 'ok' (appropriate) and 'not ok'.
@hd81504 8 years ago ⁺¹
First off, thanks for the great video, Doug! I have a follow-up question to one of the comments below:
One person commented:
So do I understand correctly that an extension of the model for 3 treatment groups and 1 control with pre and post could look like the following:
y = β0 + β1 * Dpost + β2 * Dtr1 + β3 * Dtr2 + β4 * Dtr3 + β5 * Dpost * Dtr1 + β6 * Dpost * Dtr2 + β7 * Dpost * Dtr3 + β8 * X
β5: DiD effect for Treatment 1
β6: DiD effect for Treatment 2
β7: DiD effect for Treatment 3
And you replied that this is correct.
So my question is: can you do this same procedure in logistic regression when your dependent variable is dichotomous (e.g., disease vs. no disease)?
@dougmckee673 8 years ago
Interpreting coefficients on interaction terms in nonlinear models (like logistic) is tricky. If it were me, I would just estimate a linear probability model, but there's a much longer (and better) answer here: stats.stackexchange.com/questions/89513/difference-in-differences-estimator-for-logistic-regressions
@lemoncobra2563 5 years ago
To respond to Doug, a word of caution on using an LPM: your predicted probabilities can fall outside [0, 1], and your errors will be heteroskedastic. The latter can be fixed with an extra option (robust standard errors), but the former is a fundamental issue with the estimator itself.
I would argue the point of using DiD is to examine the magnitude of change from a program, etc. With a logit regression you get your coefficients, calculate the margins, and use the margins to calculate the effect the DD had on the probability of your dependent variable. You're kind of muddling the point of using a logit in this regard, but it still works. It loses some explanatory power and some of its charm. Still doable, though.
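The three-treatment-group model quoted earlier in this thread can be checked numerically. What follows is only a minimal sketch in Python with NumPy (the video itself uses Stata): the cell means are made up, the control variable X is left out, and `fit_multi_arm_did` is a hypothetical helper name, not something from the video.

```python
import numpy as np

def fit_multi_arm_did(group, post, y, n_arms):
    """OLS for y = b0 + b1*Dpost + sum_k b*Dtr_k + sum_k b*Dpost*Dtr_k.

    group: 0 for control, 1..n_arms for the treatment arms.
    Returns coefficients ordered as
    [const, Dpost, arm dummies..., Dpost x arm interactions...].
    """
    group = np.asarray(group)
    post = np.asarray(post, dtype=float)
    arms = [(group == k).astype(float) for k in range(1, n_arms + 1)]
    columns = [np.ones_like(post), post] + arms + [post * d for d in arms]
    X = np.column_stack(columns)
    beta, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return beta

# Made-up cell means: (group, post) -> mean outcome.
cell_means = {
    (0, 0): 50.0, (0, 1): 55.0,   # control changes by +5
    (1, 0): 40.0, (1, 1): 55.0,   # arm 1 changes by +15, so DiD = 10
    (2, 0): 60.0, (2, 1): 65.0,   # arm 2 changes by +5,  so DiD = 0
    (3, 0): 45.0, (3, 1): 70.0,   # arm 3 changes by +25, so DiD = 20
}
group, post, y = [], [], []
for (g, p), cell_mean in cell_means.items():
    for offset in (-1.0, 1.0):    # two observations per cell, centred on the mean
        group.append(g)
        post.append(p)
        y.append(cell_mean + offset)

beta = fit_multi_arm_did(group, post, y, n_arms=3)
did_effects = beta[5:8]           # β5, β6, β7 in the comment's numbering
print(np.round(did_effects, 6))
```

Each interaction coefficient is that arm's change over time minus the control group's change, which is the reading given to β5 through β7 above.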
@oldtree700 7 years ago ⁺¹
Hi, Doug! Thank you so much for your great video. I have a quick question. At the end of the video you mentioned the example for the case where DiD is not ok. If the free lunch program has already been implemented in the control group, is there any way I can still use it as a control group? Can semiparametric DiD be used?
@sembilanbereguler2602 9 years ago ⁺⁴
Based on the regression result (at 8:59), what is the criterion for rejecting the null hypothesis (to say that the effect of the lunch program is statistically significant)?
@libbyalthea3061 8 years ago ⁺¹
Hello! Thank you for a great video! Do you have any advice for estimating the necessary sample size before implementing treatment? Thanks!
@josephdover6822 8 years ago ⁺¹
Hi Doug!
Thank you so much for your video.
I just wanted to ask you a small question:
I am also planning to use the difference-in-differences model. I am looking at the impact of the euro (introduced in 1999 and in circulation in 2002) on trade flows between countries in Europe, and I am new to Stata, hence I am not too sure how to proceed.
I ran the following regression:
regress Tradeflow Governmenteffectiveness1 Unemployment1 GDPpercapita1 Populationsize1 Governmenteffectiveness2 Unemployment2 GDPpercapita2 Populationsize2 Distance1-2
But I am not sure what I should do next.
Any help would be very much appreciated! :)
Best,
Joseph
@dougmckee673 8 years ago
+joseph dover To apply a difference-in-differences, you'll need to divide your trade flows into some set that might be affected by the introduction of the euro (treatment) and another set that definitely would not be (control). You will also need to reshape your data so you have observations of each trade flow before and after the euro was introduced. Then you should be able to apply the regression method shown in the video. Good luck!
@yading9202 5 years ago
Very clear, easy to understand. Great job!
@emeraldwei6672 1 year ago
Thank you! I would like to know: if there isn't a comparable group, like Rio, then how can one figure out the effect of this programme?
@sarapluviano410 7 years ago
Hi, thanks for the video. In the beginning you say that DID is useful for estimating causal effects of programs when the program is not implemented as a randomized controlled trial. So, in a randomized controlled trial, is DID not necessary? Thanks!
@linearseller2835 8 years ago ⁺¹
What a great video. I did miss conclusions about the example, though. Beta3 is 30, but it has a p-value equal to 0.228. So we can conclude that this free lunch plan didn't have a statistically significant effect (at the 95% confidence level), right? Those 30 points could have arisen by chance, right?
@dougmckee673 8 years ago
+Linear Seller Absolutely correct, and not that surprising given there were only 10 observations in this sample.
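The significance check discussed here can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the video's output: the ten scores are made up (chosen so the DiD is also 30), the standard errors are the classical homoskedastic ones, and the test compares the t-statistic with the two-sided 5% critical value for 6 degrees of freedom instead of computing a p-value.

```python
import numpy as np

# Made-up scores (NOT the video's data): 10 observations in four cells.
treated = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)
post = np.array([0, 0, 0, 1, 1, 0, 0, 0, 1, 1], dtype=float)
score = np.array([60, 70, 80, 70, 90, 40, 50, 60, 80, 100], dtype=float)

# Columns: constant, treatment dummy, post dummy, interaction.
X = np.column_stack([np.ones_like(score), treated, post, treated * post])
beta, *_ = np.linalg.lstsq(X, score, rcond=None)

residuals = score - X @ beta
dof = len(score) - X.shape[1]            # 10 observations - 4 coefficients = 6
sigma2 = residuals @ residuals / dof     # classical error-variance estimate
cov = sigma2 * np.linalg.inv(X.T @ X)
se = np.sqrt(np.diag(cov))

t_stat = beta[3] / se[3]                 # t-statistic of the DiD coefficient
T_CRIT_5PCT_DF6 = 2.447                  # two-sided 5% critical value, t with 6 df
significant = abs(t_stat) > T_CRIT_5PCT_DF6
print(f"DiD = {beta[3]:.1f}, SE = {se[3]:.2f}, t = {t_stat:.2f}, significant: {significant}")
```

With so few observations the standard error is large relative to the estimate, so even a 30-point DiD does not clear the 5% threshold in this made-up sample.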
@Run4un 7 months ago
In this example, are the y-scores the post-scores or the pre-post differences? I'm guessing just post scores? Thanks for clarifying!
@huekim589 3 years ago
Very good and funny videos bring a great sense of entertainment!
@anglofranses8205 3 years ago
This is pure gold. Thanks!
@inferno9004 8 years ago ⁺¹
Great video, Doug!!!
If there is just 1 treatment group and 1 control group with pre vs. post data and we want to include many control variables, say 5, how do we fit a model with 5 control variables? What does the regression equation look like?
@dougmckee673 8 years ago
+inferno9004 It looks just like the regression model shown in the video with the addition of your control variables.
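Written out, the answer above would look roughly like this. The notation is assumed from other comments in this thread (Dpost, Dtr, and Beta3 as the DiD coefficient), not copied from the video's slides; the γ coefficients and the index k over the five controls are introduced here for illustration only.

```latex
y_{it} = \beta_0 + \beta_1 D^{post}_{t} + \beta_2 D^{tr}_{i}
       + \beta_3 \left( D^{post}_{t} \times D^{tr}_{i} \right)
       + \sum_{k=1}^{5} \gamma_k X_{k,it} + \varepsilon_{it}
```

Under this sketch, β3 is still read as the DiD estimate, now conditional on the five controls.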
@zeinebouni8764 8 years ago ⁺¹
Hi Mr Doug,
Thank you for this interesting video.
Is it possible to do DID with ordinal outcomes? My variables: firm rating (Y); D1 (D1 == 1 if treated sample; 0 if control sample); D2 (D2 == 1 if after treatment; 0 if before).
I didn't find any examples to know whether it is possible or to see how we can interpret the estimators.
Your response is very important to me.
Thank you.
@dougmckee673 8 years ago ⁺¹
+Zeineb Ouni I haven't seen it done, but I believe you could estimate an ordered logit model (ologit) with the same covariates shown above (D1, D2, and D1*D2 in your case). You have to be careful with interpreting interactions in the ordered logit, but I think the basic idea is valid.
@zeinebouni8764 8 years ago ⁺¹
+Doug McKee Thank you so much.
@johndupont8596 8 years ago ⁺¹
Hi Doug
Thanks a lot for the video! I just have a question. I want to run a difference-in-differences model in Stata comparing students who received maths lessons with those who didn't. I would like to test whether having extra maths lessons helps students achieve higher marks.
My variables are: "StudentID" "TIME" "MATHS_LESSON" "MARKS"
But the problem I have is that not every student received maths lessons over the period, and I would like to create 2 groups, one "maths_lesson" and one "Nomaths_lesson", by adding them to the variable column "StudentID". How should I proceed?
Let me recap: what I am now trying to obtain is a graph with "time" on the x axis and "marks" on the y axis with two lines (one for the group of students who took maths classes and one for the group that didn't), but I am struggling a bit to achieve this.
Hope I am clear in describing my problem!
Best regards,
John
@dougmckee673 8 years ago ⁺¹
+John Dupont Using your TIME variable, you should divide your observations into "before" and "after" groups. You've already divided your students into those that got the treatment (MATHS_LESSON) and those that didn't. Once you have that, you can compute means of the four cells and subtract them to get the DD estimate. I advise first understanding your data and computing the required numbers before worrying about communicating those numbers with a graph. Hope this helps!
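The four-cell calculation described in this reply can be sketched in a few lines of plain Python. The rows below are hypothetical, the `after` flag is assumed to have already been derived from TIME, and `cell_mean` and `diff_in_diff` are illustrative names, not part of the video.

```python
from statistics import mean

# Hypothetical rows: (student_id, after, maths_lesson, marks).
rows = [
    (1, 0, 1, 52), (1, 1, 1, 70),
    (2, 0, 1, 48), (2, 1, 1, 66),
    (3, 0, 0, 60), (3, 1, 0, 64),
    (4, 0, 0, 56), (4, 1, 0, 62),
]

def cell_mean(rows, after, treated):
    """Average marks for one (period, group) cell."""
    return mean(m for _, a, t, m in rows if a == after and t == treated)

def diff_in_diff(rows):
    """(treated after - treated before) - (control after - control before)."""
    treated_change = cell_mean(rows, 1, 1) - cell_mean(rows, 0, 1)
    control_change = cell_mean(rows, 1, 0) - cell_mean(rows, 0, 0)
    return treated_change - control_change

print(diff_in_diff(rows))
```

Here the treated group improves by 18 marks and the control group by 5, so the DD estimate is 13. The four cell means are also exactly the points needed for the two-line graph asked about above.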
@bright1402 5 years ago ⁺¹
Thank you for your video! But at 8:06, what is the difference between \beta_0 and \epsilon?
@jotaeleoh 5 years ago ⁺¹
Beta_0 is the intercept: the expected value of the outcome "y" when all the other variables are zero. Epsilon is the error term, which basically contains all other components of "y" that the model does not capture.
@GoonieFridkin 8 years ago ⁺¹
Hi. Thanks so much for this! Quick question though. I've just run a DD regression on my data. The DD beta isn't significant, but the group (test vs control) beta is. What does this mean?
@dougmckee673 8 years ago ⁺¹
The insignificant DD beta means there is no statistically significant effect of the treatment. The significant group beta means you have significant pre-treatment differences between the groups.
@braddoremus588 7 years ago
Thank you - very good explanation. Helped clear a lot up for me.
@alfonsoga95 5 years ago ⁺¹
Thanks, I have one question though: what's the name of the program you're using for the regression? I'm not familiar with it, and I find it quite practical.
@oyvsni6679 5 years ago ⁺¹
Doug is using Stata
@VikramSingh-sf1ev 3 years ago
Very clear and to the point
@tarpinianmt 9 years ago ⁺¹
Thank you so much for this, I had never heard of difference in differences until a reading I had for economic development. I'm actually planning to reference this video in a paper; do you have anything you'd want me to include for a citation?
Thanks again.
@dougmckee673 9 years ago ⁺²
Matthew Tarpinian I'm really glad you've found the video helpful, but it's probably not appropriate for a citation in your paper. If you want a good reference for the method, I suggest using Angrist and Pischke's _Mostly Harmless Econometrics_ instead.
@tjahangon7286 9 years ago ⁺¹
Thank you very much. This video really helps me. What statistics program did you use in this video? Stata?
@dougmckee673 9 years ago ⁺¹
I did use Stata to get some of the numbers shown, but the content is fairly independent of the software in this video. Stata plays a bigger role in some of my other videos.
@tjahangon7286 9 years ago ⁺¹
Thank you very much.
@tjahangon7286 9 years ago ⁺²
Doug McKee May I ask one more question? I am using a binary dependent variable (dummy). I searched online and found that it is possible to fit a regression model with a binary dependent variable (in Stata: the probit and logit commands). In your opinion, can this also be done in a DD regression (I mean, using the command logit y DTr DPost DTrXDPost)?
@dougmckee673 9 years ago ⁺³
Short answer: Yes. Longer answer: If you use your binary dependent variable in a linear regression model exactly as shown here, you are estimating a linear probability model. The coefficients can be interpreted as effects on the probability of the dependent variable being one. Most economists would do this. You *could* estimate a logistic model with the same variables on the right hand side, but it is much harder to interpret the magnitude of the coefficient on the interaction.
@tjahangon7286 9 years ago ⁺¹
Doug McKee Do you mean that if y is a binary dependent variable and:
1. I use the command [regress y DTr DPost DTrXDPost], then I am "estimating a linear probability model. The coefficients can be interpreted as effects on the probability of the dependent variable being one."
2. I use the command [logit y DTr DPost DTrXDPost], then "it is much harder to interpret the magnitude of the coefficient on the interaction."
I hope your answer is "yes".
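Option 1 in the comment above can be illustrated with a small sketch. The counts are made up, and Python with NumPy stands in for Stata here; the least-squares fit plays the role of the OLS behind `regress y DTr DPost DTrXDPost`, so only the coefficients (not Stata's output table) are reproduced.

```python
import numpy as np

# Made-up counts: (treated, post) -> (number with y = 1, cell size).
cells = {(0, 0): (4, 10), (0, 1): (5, 10), (1, 0): (3, 10), (1, 1): (7, 10)}

d_tr, d_post, y = [], [], []
for (tr, po), (ones, n) in cells.items():
    d_tr += [tr] * n
    d_post += [po] * n
    y += [1] * ones + [0] * (n - ones)

d_tr = np.array(d_tr, dtype=float)
d_post = np.array(d_post, dtype=float)
y = np.array(y, dtype=float)

# Linear probability model: OLS of the 0/1 outcome on the DiD dummies.
X = np.column_stack([np.ones_like(y), d_tr, d_post, d_tr * d_post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

did = beta[3]        # change in P(y = 1) attributed to treatment, in probability units
fitted = X @ beta    # fitted probabilities, one per observation
print(round(did, 3), round(fitted.min(), 3), round(fitted.max(), 3))
```

The interaction coefficient equals the difference-in-differences of the four cell proportions, (0.7 - 0.3) - (0.5 - 0.4) = 0.3, which is what makes it easy to read. In this dummy-only model the fitted values are just the cell proportions, so they stay between 0 and 1; that is no longer guaranteed once continuous controls are added.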
@zhouchen7682 9 years ago
Very useful, waiting for more.
@xingu7561 6 years ago
It is really helpful! This video is easy to understand for new learners like me! I really appreciate your help! If I can survive my PhD program, I hope I can make videos like this in the future!
@wisuraweerathunga2188 4 years ago
Thanks for this one! You made it clear!
@shubrathak.p.7198 8 years ago ⁺¹
Hi Doug. Please help me! Can I use DID if my data do not satisfy the normality assumption? If not, is there a non-parametric DID?!
@dougmckee673 8 years ago ⁺²
If you have a large enough number of observations (at *least* 25, and I'd feel comfortable over 100), then your outcome doesn't need to be normal. The Central Limit Theorem says your estimate of the treatment effect will be approximately normal.
I believe there are nonparametric DiD-like methods when you have a continuous treatment and you believe the effect is nonlinear, but I don't know much about them.
@shubrathak.p.7198 8 years ago ⁺¹
Thank you Doug!
@rheabanerjee4938 5 years ago
I wish you would post more, you're great!
@Non-disjunction 3 years ago
Amazing video
@eiinre 7 years ago
Hi Doug, how do I add additional controls (i.e. X) into the model? I am using SPSS to do the DiD. Do I just add the control variable and treat it as an independent variable?
@vedantss 2 years ago
Very useful!
@tuhinurrahmanchowdhury9705 3 years ago
Great video. It saved me!
@hassanmurtzakhan 9 years ago ⁺¹
I am trying to run this in Stata and it omitted Beta3 because of multicollinearity between variables. Can you guide me on how to handle it?
Thanks
@dougmckee673 9 years ago ⁺²
Hassan Murtza Khan I don't usually answer Stata questions on YouTube, but I'll make an exception just this once. :) There are two possibilities. The first is that you don't have observations for each group (treatment and control) in both the before and after periods. Cross-tabulate your treatment dummy against your post-period dummy and make sure all four cells have observations. The second possibility is that you made a mistake constructing the interaction variable. Check this by tabulating the interaction with each of the dummies to make sure the result makes sense.
Now your job is to try these and report back so everyone can learn!
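The two checks suggested in this reply can be sketched outside Stata as well. This is a hedged illustration in plain Python; `check_did_design` is a hypothetical helper, and the example data are invented to show the first failure mode (an empty cell).

```python
from collections import Counter

def check_did_design(treated, post, interaction):
    """Return a list of problems that would make the interaction term collinear."""
    problems = []
    # Check 1: every (treated, post) cell must contain at least one observation.
    counts = Counter(zip(treated, post))
    for cell in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        if counts[cell] == 0:
            problems.append(f"no observations with treated={cell[0]}, post={cell[1]}")
    # Check 2: the interaction must equal treated * post on every row.
    bad_rows = [i for i, (t, p, x) in enumerate(zip(treated, post, interaction))
                if x != t * p]
    if bad_rows:
        problems.append(f"interaction != treated*post in rows {bad_rows}")
    return problems

# Invented example: no treated observations in the pre-period.
treated = [0, 0, 1, 1]
post = [0, 1, 1, 1]
interaction = [0, 0, 1, 1]
print(check_did_design(treated, post, interaction))
```

In the invented data the interaction column is identical to the treatment dummy, which is exactly the kind of collinearity that leads a regression routine to drop one of the terms.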
@lauramendezcarvajal5149 8 years ago
Douglas, thanks for this amazing video, it helped me so much! I just have a question:
Why does (y) have only one test score? I am a little bit confused about the pre-test and post-test information. If I have the test scores before the implementation and the scores after, how do I combine them? Thanks
@dougmckee673 8 years ago ⁺¹
The key is to have (or be able to compute) the average test score of both groups before AND after the intervention.
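One common way to end up with a single y column is to stack the pre and post scores in "long" format, with one row per student and period. Whether the video's data were laid out exactly this way is an assumption; the records, field names, and `to_long` helper below are hypothetical.

```python
# Hypothetical wide records: one row per student with a score before and after.
wide = [
    {"student": "A", "treated": 1, "pre": 50, "post": 75},
    {"student": "B", "treated": 0, "pre": 60, "post": 65},
]

def to_long(wide_rows):
    """Stack pre and post scores so each row is one student-period observation."""
    long_rows = []
    for row in wide_rows:
        for d_post, key in ((0, "pre"), (1, "post")):
            long_rows.append({
                "student": row["student"],
                "d_tr": row["treated"],
                "d_post": d_post,
                "d_tr_x_d_post": row["treated"] * d_post,
                "y": row[key],          # the single outcome column
            })
    return long_rows

long_rows = to_long(wide)
for r in long_rows:
    print(r)
```

Each student contributes two rows, and the post-period dummy tells the regression which score is which.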
@sembilanbereguler2602 9 years ago
Based on the regression result (at 8:59), what is the criterion for rejecting the null hypothesis?
@leopan54321 2 years ago
Dude. This saved me, thanks :)
@DavidLihm 8 years ago ⁺⁶
Thank you so much, this has been really useful!
@ec.juanfranulcuangolee3294 4 years ago
Any impact evaluation is supposed to start with #Building the #DataBase, and only then should a methodology such as DID be analyzed, isn't it?
@bright1402 5 years ago
Thank you so much for your video! But in the last slide, I could not understand the Not OK case...
@GradualReportSerbia 4 years ago ⁺¹
Abrupt ending, good video
@vegasastras9194 3 years ago ⁺¹
What is that program at 8:17? It looks very neat.
@donasp5391 3 years ago ⁺¹
Stata
@saraly2 2 years ago
Thank you!
@thej1091 3 years ago
Thank you kind sir! :)
@homayoungerami4176 4 years ago
Thanks, it was easy to digest.
@Dniem 4 years ago
Hello Professor Armstrong!
@fritzlouw8434 9 years ago
Much appreciated. Keep it up man!
@monicabraga4344 2 years ago
How did you do it? Can you share it with me? Thank you
@JM-fr9bc 3 years ago
What are the assumptions of diff-in-diff?
@Ytremz 8 years ago ⁺²
Brilliant
@rohangopalakrishnan7417 3 years ago
Big from you Doug
@ahmedseliem3201 3 years ago
How do I do a difference-in-differences analysis using SPSS? I need practical steps.
@Nem3siS4o 7 years ago
Thanks!
@chocolateyum678 6 years ago
thank you!!!!!!!!
@brucelee7782 5 years ago ⁺¹
I didn't get the DiD effect of 30 at 7:35, somebody help please! 😓
@liveybeha 4 years ago ⁺¹
I didn't either at first! Remember to average (rather than add) each set of observations before doing the DiD calculation.
@matinhewing1 7 years ago ⁺¹
Who downvoted this video? Someone who didn't get a free lunch?
@weoweoteo 6 years ago ⁺¹
lol! This vid was super helpful, especially for my econometrics exam tomorrow xd
@brothermalcolm 3 years ago
Everything made sense until 7:55, help!
@joaoluistbarroso6917 3 years ago
Show
@sjhoenen 9 years ago
Thanks!
@ursulapulyer916 8 years ago
thank you!
