Hi, thank you for this. Can you do this for survey data with weights? How do you handle the weights across different survey years?
Hi! Thank you for the question. In terms of applying the append command it should make no difference. In terms of other impacts on the dataset, I am not so sure.
Thanks! What about appending 2 or more datasets (longitudinal data) where all the variables are the same in the questionnaire but are spelled slightly differently? E.g. for 2023 = feeling23 and for 2022 = feeling22. Is there a quick way to match about 200 variables?
Hi! If they have a similar stub as for example 'feeling' in your example, then you can do this in a loop.
The groups and loops video explains this.
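For readers who want to see the idea written out, here is a minimal sketch of such a loop. It assumes the only difference between the years is a two-digit year suffix, as in feeling22 and feeling23. The file names survey2022.dta and survey2023.dta are hypothetical, and any variable whose name happens to end in 22 or 23 for another reason would be renamed as well, so check the result with describe.

```stata
* Sketch only: hypothetical file names, adjust to your own data
foreach yy in 22 23 {
    use "survey20`yy'.dta", clear
    * Loop over every variable whose name ends in the two-digit year
    foreach v of varlist *`yy' {
        * Drop the last two characters, e.g. feeling23 -> feeling
        local stub = substr("`v'", 1, strlen("`v'") - 2)
        rename `v' `stub'
    }
    * Keep track of which survey year each row comes from
    gen year = 20`yy'
    save "survey20`yy'_renamed.dta", replace
}

* With matching names, the two years stack into the same columns
use "survey2022_renamed.dta", clear
append using "survey2023_renamed.dta"
```

The rename has to happen before the append, because append matches variables by name and would otherwise keep feeling22 and feeling23 as two separate columns with missing values in the other year.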
@SteffensClassroom Hi, thanks again for the quick answer. If they are not the same, in my case ch22o_EN_1.0p and ch23p_EN_1.0p, would there also be a loop for that?
The only time I encountered this, I simply went and changed the name of one of them (:
How do we go about it if we want to add in additional years? Do we have to clear the year prior?
Hi!
The most straightforward way would be to add each additional year one by one. Of course, it depends on what the data you wish to append look like. If all the additional data is in one file, you can append everything in one go.
@SteffensClassroom Great! If I have 10 years' worth of data, what would the code be after that? This is what I saved from your video, and I was able to get years 2013 and 2014. I'm curious how to continue with adding 2015, etc.:
use "2013 Merged.dta"
gen ID = _n
gen year = 2013
order ID, first
order year, after(ID)
save 2013_merged, replace
use "2014 Merged.dta"
gen ID = _n
gen year = 2014
order ID, first
order year, after(ID)
save 2014_merged, replace
use 2013_merged, clear
append using 2014_merged
sort ID year
Ideally you could write a loop to make each year ready for the append, but if you want to do it step by step (not efficient but it works), then all you do is to replace the year with the year you have. After preparing each year, you can then add an append line for each year after: append using 2014_merged
So for example, you would add this on the next line: append using 2015_merged
A loop version of my code would look something like this:
forvalues i = 1(1)2 {
sysuse auto`i'
gen ID = _n
gen year = `i'
order year, after(ID)
save auto`i'_ready, replace
}
use auto1_ready, clear
forvalues i = 2(1)2 {
append using auto`i'_ready
save auto_new, replace
}
In your code, the counter would start at 2013 :)
You can also write this as a nested loop to make this more efficient, but this will work.
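Adapted to the file names from the question, the same two-loop pattern might look like the sketch below. It assumes the ten files are named "2013 Merged.dta" through "2022 Merged.dta" and sit in the current working directory. The end year 2022 and the output name all_years_merged are assumptions, so adjust them to your data.

```stata
* Sketch only: assumes files "2013 Merged.dta" ... "2022 Merged.dta" exist
* Step 1: prepare each year (add ID and year, save a ready-to-append copy)
forvalues y = 2013/2022 {
    use "`y' Merged.dta", clear
    gen ID = _n
    gen year = `y'
    order ID year, first
    save "`y'_merged", replace
}

* Step 2: start from the first year and stack the remaining years below it
use "2013_merged", clear
forvalues y = 2014/2022 {
    append using "`y'_merged"
}
sort ID year
save "all_years_merged", replace
```

Saving once after the second loop, rather than inside it, avoids rewriting the combined file on every pass. Note that gen ID = _n only numbers the rows within each file, so the same ID in two years refers to the same respondent only if the files list respondents in the same order.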
Thanks! However, all the other variables now have the same values in the following year. Is that because the datasets contain the same information, or do you need other Stata commands to prevent this from happening?
This seems to be because of the same information being present. That would be my guess :)
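One way to check that guess is to compare the two prepared files directly with Stata's cf command. This is a sketch that reuses the file names from the question and assumes both years have the same number of observations, since cf stops with an error otherwise.

```stata
* Sketch only: compares every 2014 variable with its 2013 counterpart
use 2014_merged, clear
* year differs by construction, so leave it out of the comparison
drop year
cf _all using 2013_merged, verbose
```

If cf reports no mismatches, the two source files really do hold identical values and the append itself is not the cause.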
Thank you!