Bootstrap in Stata

Econometrics, Causality, and Coding with Dr. HK

Published 4 Dec 2024

COMMENTS • 71

@I.amBago 8 months ago
First 3 minutes were exactly what I wanted to know. Thank you!
@aysecetinel 2 years ago
This was super helpful!!! I first bootstrapped using bsample and then ran a multi-level fixed effect using reghdfe in the loop. It works great!
I noticed by default it sets the obs to be equal to the size of the dataset that you are sampling from. It also lets you oversample by setting an obs greater than the size of the dataset.
I also tried bootstrapping by using the command bootstrap, reps(#): then reghdfe. This by default lets you specify the obs number to sample as equal to the number of clusters in the dataset.
Thank you again for creating content and sharing! Looking forward to reading your book and hope that you'll have workshops tailored to grad students around the world.
@tarantula6649 1 year ago
Very helpful video! Thanks a lot!
@jessyjkn 4 years ago
Omg you literally SAVED MY LIFE!!!!! Thank you Thank you Thank you!!!!!!
@lifehappy217 4 years ago ⁺¹
Hi, Nick. Thank you so much for the nice video. I am doing panel regression, and wondering whether it is possible to use bootstrap to get the confidence intervals for the panel model using Stata (or R)?
@NickHuntingtonKlein 4 years ago
Yep! That's a different goal than in this video though. See www.stata.com/support/faqs/statistics/bootstrap-with-panel-data/
@lifehappy217 4 years ago
@NickHuntingtonKlein Thank you so much. This is what I want to learn. It is helpful.
@pablovelazquez1903 6 years ago
Thank you for this clear explanation.
@aibannongspung1765 2 years ago
Hi Nick. Thank you so much for the insightful video. I have a question to ask you. I am running a regression model and have also added weights to it (e.g. I used aw= wt), since the survey data comes with a survey weight/multiplier. I need to bootstrap the model and report the standard errors thereafter. However, I cannot use the same weights while bootstrapping. Is there a way around this issue? Will the standard errors generated without weights after bootstrapping be significantly different from the standard errors of the regression model with the weight?
@NickHuntingtonKlein 2 years ago
The downloadable package bsweights will help you do this
@aibannongspung1765 2 years ago
@NickHuntingtonKlein Thank you for the reply. I just want to mention that the survey data that I am using does not have replicate weights. From what I understand, bsweights are helpful when the survey data also includes replicate weights. Can bsweights be used to manually generate these replicate weights for survey data without them?
@NickHuntingtonKlein 2 years ago
@aibannongspung1765 Oh, I see. Maybe look at svy bootstrap. The replicate weights refer to the weights you get from bootstrap: www.stata.com/manuals/svysvybootstrap.pdf
@aibannongspung1765 2 years ago
@NickHuntingtonKlein Thank you Nick. I will give it a try.
@gabriellanocita4239 4 years ago
Thanks for the video! I'm wondering if bootstrapping can be used to run an MLM model with random effects predictors in Stata?
@NickHuntingtonKlein 4 years ago
It sounds like you're looking for bootstrapped standard errors, which is something a bit different than this video is about. But yes you can apply bootstrap SEs to any model in Stata, see www.stata.com/features/overview/bootstrap-sampling-and-estimation/
@yasmindoghri9175 2 years ago
Thank you very much for this video!! I was wondering if I could use bootstrap with different samples. I constructed an index by merging in data from a different dataset (I extracted mean values per variable from the latter, since it is much larger than my sample). I would like to check whether the final index measurement is influenced by the size of the external sample. So I treated my sample as the original dataset, loaded the external dataset after preserve, and put the distance index formula in the loop. Yet once I run it, it says already preserved. What am I getting wrong?
@NickHuntingtonKlein 2 years ago
There's a bit too much in here for me to follow it, but if you're getting an already-preserved error, that means that you tried to preserve twice in a row without a restore in between. So make sure each preserve is matched by a restore, or if you want to clear out your last preserve without restoring, use "restore, not"
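The pairing rule in the reply above can be sketched as follows. This is a minimal illustration rather than the commenter's code: the built-in auto dataset and the particular commands between each preserve and restore are assumptions chosen only to make the example runnable as a do-file.

```stata
* Sketch: every preserve is closed before the next one is issued.
* The auto dataset and the commands inside each block are illustrative.
sysuse auto, clear

preserve
keep if foreign == 1
summarize mpg
restore                 // the original data are back in memory

preserve
bsample
summarize mpg
restore, not            // cancel the pending preserve and keep the resampled data

preserve                // allowed again, because no preserve is pending
restore
```

Issuing a second preserve before either form of restore is what triggers the already-preserved error.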
@mikecheng6010 4 years ago
Hi thank you so much Nick! If I wanna get the coefficient for each iteration, what should I do?
@NickHuntingtonKlein 4 years ago ⁺¹
If you are running a regression in your bootstrap you can pull a coefficient out and store it in a local (just like in the code in the video). The way to refer to a coefficient after running the regression is with _b[x], where x is the name of the variable you want the coefficient for
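One way to sketch the idea in the reply above, assuming the built-in auto dataset and a regression of mpg on weight and foreign. This is not the code from the video: it stores each coefficient in a matrix rather than a local, and the names boots, coefs, and boot_coefs are hypothetical. It is meant to be run as a do-file.

```stata
* Sketch: collect the coefficient on weight from each bootstrap sample.
* Dataset, model, 200 replications, and all names are illustrative.
sysuse auto, clear
set seed 12345
local boots = 200
matrix coefs = J(`boots', 1, .)

forvalues i = 1/`boots' {
    preserve
    bsample                             // resample with replacement, same N
    quietly regress mpg weight foreign
    matrix coefs[`i', 1] = _b[weight]   // coefficient from this replicate
    restore
}

* Turn the stored coefficients into a variable and summarize them
clear
svmat coefs, names(boot_coefs)
summarize boot_coefs1
```

The variable boot_coefs1 then holds one coefficient per iteration, ready to be summarized or plotted.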
@mikecheng6010 4 years ago
@NickHuntingtonKlein Got it, thank you so much, it works perfectly.
@kangkana1354 4 years ago
Thank you so much Nick. I have a query on whether bootstrapping can be used on a survey-weighted dataset, which uses a svy command before a regression. If yes, how can the code be modified?
@NickHuntingtonKlein 4 years ago
If you're just trying to get bootstrapped SEs, look at the "svy bootstrap" help file
@kangkana1354 4 years ago
@NickHuntingtonKlein Thank you so much. I am going through the file currently to clear the basics.
@nandinimishra2149 2 years ago
Nice Job Nick 💓💓💓💓
@nandinimishra2149 2 years ago
Could you share your ID so I can ask about a problem related to Stata?
@ProfessorAliAhmed 4 years ago
I am using the Stata KCDF function and then using the variable it generates in my regression model. Since my variable is estimated, I have to bootstrap the process. I am able to do the looping and bootstrapping based on your method, but I am not able to use the generated bootstrapped variable in the model to get bootstrapped standard errors. Any suggestions would be very helpful. Thank you.
@NickHuntingtonKlein 4 years ago
Just take the standard deviation of your bootstrapped coefficient (for example, with the summarize command). That's the bootstrap standard error.
@ProfessorAliAhmed 4 years ago
@NickHuntingtonKlein Thank you Nick!
@貴広今泉-l4e 3 years ago
Thank you so much for your wonderful video! I just registered this channel as my favorite. Thanks. I'm wondering if I could use this in the regression command. In each loop, I opened the original dataset, ran the regression command and obtained the coefficient. Then I aggregated the results of each resampling. (I mean I calculated the mean and sd of the coefficient.) Am I right?
@NickHuntingtonKlein 3 years ago ⁺¹
Yep, that works
@貴広今泉-l4e 3 years ago
@NickHuntingtonKlein Thanks! Your videos went viral in my community!
@貴広今泉-l4e 3 years ago
@NickHuntingtonKlein
By the way, in Stata software, the bootstrap command can also work but the coefficients do not change and only standard errors change. I could not understand why.
sysuse auto, clear
regress mpg weight gear foreign
regress mpg weight gear foreign, vce(bootstrap, rep(1000))
In the second command, you can get the coefficient and SE. But the coef is actually the same as the original model.
What is the difference?
@NickHuntingtonKlein 3 years ago ⁺¹
@貴広今泉-l4e The second command is estimating the coefficient by regular OLS and only the standard errors by bootstrap. This is actually a good idea if you plan to use them for hypothesis tests, as it helps any hypothesis tests done after the fact be sure they're comparing the right things.
@貴広今泉-l4e 3 years ago
@NickHuntingtonKlein Thank you very much! Got it! Now I understand the mechanism. Much appreciated.
I am working on prediction model development and I wanted to learn how to perform internal validation using the bootstrap resampling method. I guess your program would work to calculate the optimism statistics to evaluate the prediction model based on the regression models. Aren't you going to make some video on this topic??
@ataliethompson6725 4 years ago
How does one get a bootstrap 95% CI and p-value for the difference in two proportions, particularly in multilevel data? I have a dataset where eyes are nested within subjects. I want to show that the proportion of var1 is significantly different from the proportion of var2, and since the data is multilevel I'm assuming a bootstrap 95% CI and p-value would be the way to address this?
@NickHuntingtonKlein 4 years ago
For multilevel data you generally want to do bootstrap sampling by cluster. Once you do that, just store all the ratio estimates from all the bootstrap iterations. The 2.5th and 97.5th percentiles of the estimates are your confidence interval.
@ataliethompson6725 4 years ago
@NickHuntingtonKlein How does one bootstrap for the difference in two proportions (as opposed to a mean)?
@NickHuntingtonKlein 4 years ago
@ataliethompson6725 That's the beauty of bootstrap: just calculate whatever it is you want to calculate in each of the bootstrap samples. So calculate the difference in proportions.
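A rough sketch of the approach described in this exchange, using simulated data because the commenter's dataset is not shown. The variable names subject, var1, and var2, the sample sizes, the true proportions, and the 400 replications are all assumptions. It resamples whole subjects with the cluster() option of bsample and takes the 2.5th and 97.5th percentiles of the stored differences as the interval. Run it as a do-file so the temporary file persists across lines.

```stata
* Sketch with simulated data: eyes nested within subjects.
* subject, var1, var2, and every number here are hypothetical.
clear
set seed 2024
set obs 200                          // 100 subjects x 2 eyes
generate subject = ceil(_n/2)
generate var1 = runiform() < 0.40
generate var2 = runiform() < 0.30
tempfile original
save `original'

local boots = 400
matrix diffs = J(`boots', 1, .)
forvalues i = 1/`boots' {
    use `original', clear
    bsample, cluster(subject)        // resample whole subjects, keeping both eyes
    quietly summarize var1
    local p1 = r(mean)
    quietly summarize var2
    matrix diffs[`i', 1] = `p1' - r(mean)   // difference in proportions
}

* Percentile confidence interval from the stored differences
clear
svmat diffs, names(diff)
_pctile diff1, percentiles(2.5 97.5)
display "95% percentile CI: " r(r1) " to " r(r2)
```

Any other statistic can be swapped in for the difference inside the loop without changing the rest.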
@evahakobjanyan8528 5 years ago
Great video, I have a question. I did exactly what you show in the video, but without g x normal, because I already had data. But an error happens every time: ''invalid obs no''. What does it mean?
@NickHuntingtonKlein 5 years ago
The "set obs" command is for the purpose of creating the fake data, you don't need it if you already have data, and it will produce that error.
@evahakobjanyan8528 5 years ago
@NickHuntingtonKlein Do I need the g store_means that you write before the word 'quietly'?
@NickHuntingtonKlein 5 years ago
@evahakobjanyan8528 You need some sort of variable to store the results in, yes.
@QuynhNguyen-ij6fe 4 years ago
Can you guide using bootstrap with xtabond2? Thanks
@NickHuntingtonKlein 4 years ago
For bootstrap SEs? I'm not certain that the bootstrap standard error assumptions are justified in the Arellano-Bond case. But in any case you should be able to apply the guide on this page about bootstrapping in a panel/ts setting: www.stata.com/support/faqs/statistics/bootstrap-with-panel-data/
@andreab2114 4 years ago
What if I have missing values or a multiply imputed dataset ?
@NickHuntingtonKlein 4 years ago
Missing values you just keep using as normal. For multiple imputation you could bootstrap each imputation separately. There might even be a special MI bootstrap in Stata 16, I'm not sure, they added a bunch of MI stuff.
@justalice5139 5 years ago
what if it shows ''floor not found''?
@NickHuntingtonKlein 5 years ago
That suggests there's an error in the line with floor in it. Remember, floor is a function, not a variable. So floor() is correct, not floor () or floor*()
@HE-gw2gr 1 year ago
How do I implement the Kónya (2006) bootstrap panel Granger causality approach in Stata? Please help me 😢
@NickHuntingtonKlein 1 year ago
No idea! Never heard of it. If it were me I'd Google for it.
@HE-gw2gr 1 year ago
@NickHuntingtonKlein Thank you. Of course I searched; unfortunately I couldn't find it.
@alisadavtyan2133 5 years ago
What command should I change if I already have an existing variable? This part: g X=rnormal(4)*2+4
@NickHuntingtonKlein 5 years ago
Bootstrapping over an existing variable? It should all work the same, you can just skip generating a new variable and use the old one.
@alisadavtyan2133 5 years ago
@NickHuntingtonKlein And what about set obs 10000? Should I write my obs number?
@NickHuntingtonKlein 5 years ago
@alisadavtyan2133 Everything before the "save originaldata.dta" line is just me creating the fake data, you don't need it. You can just open up your existing data instead.
@alisadavtyan2133 5 years ago
@NickHuntingtonKlein And is local boots the number of my obs?
@NickHuntingtonKlein 5 years ago
@alisadavtyan2133 That's the number of bootstrap iterations
@YorgosEU 5 years ago
I am doing a cost-effectiveness analysis for costs and health benefit. From my data I calculated an average cost and an average effect per treatment arm in order to calculate the ICER. Then my supervisors told me that this is not enough and that I need to do bootstrapping... I know how, but I DO NOT HAVE A CLUE WHY I need to do this. Does anyone know? THANKS!!
@NickHuntingtonKlein 5 years ago ⁺¹
I would recommend posting this question in more detail on StackExchange
@YorgosEU 5 years ago ⁺¹
@NickHuntingtonKlein thanks Nick
@diverdown0011 7 years ago
Could you provide the do file? I keep getting an error.
@NickHuntingtonKlein 7 years ago ⁺¹
Walter Chin I'm afraid I didn't keep the do file. It's just the same code you can see in the video though.
@diverdown0011 7 years ago
Thanks for taking the time to reply. I figured it out. There was a minor issue in the code I entered.
The boot code is working.
Would you happen to know how this can be done for nested data? I have diving data with parameters of depths and bottom times (how long and how deep). These dives belong to a group of 17 small-scale fishermen divers. Each fisherman conducted a range of 100-400 dives per year.
My goal is to get a good understanding of their average depth and bottom time. The dives are nested within each fisherman. The averages per fisherman have a lot of variance.
Anyway any help is greatly appreciated.
@NickHuntingtonKlein 7 years ago ⁺¹
Walter Chin There are two ways to go about this depending on what you want to do with it. One uses the "strata" option of bsample, and the other uses the "cluster" option (see help bsample). Strata does a bootstrap such that you are resampling within fishermen (ie fisherman A did ten trips and B did 16, so you resample from A ten times and B 16 times). Cluster resamples at the fisherman level (ie it will resample from fisherman A and fisherman B, picking all the trips that fisherman goes on). If the problem is that there's a lot of noise within fishermen, you probably want the strata option, but I'd recommend looking closer at the help file for more details.
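The two options mentioned in the reply above can be compared side by side. This is a sketch on simulated dive data, since the real data are not shown: the variable names diver and depth, the 17 divers with 100 dives each, and the depth distribution are all assumptions. Run it as a do-file.

```stata
* Sketch with simulated dive data; diver, depth, and all numbers are hypothetical.
clear
set seed 17
set obs 1700                         // 17 divers x 100 dives each
generate diver = ceil(_n/100)
generate depth = rnormal(20, 5) + diver

preserve
bsample, strata(diver)               // resample dives within each diver
summarize depth
restore

preserve
bsample, cluster(diver)              // resample whole divers with all their dives
summarize depth
restore
```

With cluster(), the same diver can be drawn more than once; the idcluster() option of bsample creates a new identifier for each drawn cluster when later steps need to tell those copies apart.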
@ASMTowhid 6 years ago
Could you please help me? My code is not working. It's showing the following error:
. set obs 'boots'
''' invalid
It is not an integer or its value is too large.
@hamaybe 8 months ago
@ASMTowhid The first apostrophe should be a backtick (the key next to the 1), i.e. `boots'; it is an annoying feature of specifying locals
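For reference, a minimal sketch of the quoting the reply describes. The macro name boots matches the thread, while the value 2000 is an arbitrary assumption. Run it as a do-file, since a local macro only lives for the duration of the run that defines it.

```stata
* Sketch: a local macro is opened with a backtick and closed with an apostrophe.
* The value 2000 is illustrative.
clear
local boots = 2000
set obs `boots'          // `boots' expands to 2000 before set obs runs
display _N
```

Typing a straight apostrophe on both sides leaves the text unexpanded, which is why set obs reports that the value is not an integer.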
@afshanyounas4495 4 years ago
i am still confused....
@anusuyabiswas6687 4 years ago
complicated and confusing... Better to use original data
