I think Set Variable 2 (i.e. PreviousModifiedDate) should have been inside the If condition. The current file's modified time should always be compared with the highest modified time among the previous files.
Amazing Video...
Just wanted to cross-check: I think the Set Variable (last modified date) should also come under the If activity. Only then does it work correctly in my case.
I think you are right.. a good catch!
Is this like incremental loading
@@JitendraSingh-pg9sc ua-cam.com/video/sYM6kVpng28/v-deo.html
Yes, it should come under the If.
I am very much waiting for these kinds of scenarios.
Thanks Maheer
Welcome 🤗
Hi Sir, I am really liking your series of videos and learning from it. I believe there is one issue in the above implementation: setVariable2 should also come under the If true condition, along with Set Variable 1. It works in your case since your latest file is the last file the foreach loop visits, but if it isn't the last one then that file will not be copied. Please check.
Thanks for highlighting this issue. I noticed the same while implementing it - it won't work if your first file is the latest file. @Bandhu Gupta: Can you please suggest where we need to correct it to implement the logic properly?
@@sourabhgupta1428 you can put both variables, PreviousModifiedDate and LatestFileName, inside the If -> True activity. Outside the If activity you can add a new variable and assign it @variables('LatestFileName') - this will give you the latest file name.
Just move the 2nd Set Variable activity into the If -> True activity, before or after the 1st Set Variable activity, and join them. No need to modify anything else.
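For anyone applying this fix, here is a minimal sketch of the corrected True branch, assuming the activity and variable names used in the video (Get Metadata2, PreviousModifiedDate, LatestFileName), that both variables are of type String, that PreviousModifiedDate is initialized to an old timestamp such as 1900-01-01, and that the ForEach runs sequentially:

If Condition expression (inside the ForEach):
@greater(ticks(activity('Get Metadata2').output.lastModified), ticks(variables('PreviousModifiedDate')))

Inside If -> True, two chained Set Variable activities:
Set Variable - LatestFileName = @item().name
Set Variable - PreviousModifiedDate = @string(activity('Get Metadata2').output.lastModified)

Because PreviousModifiedDate is now updated only when the current file is newer, it always holds the maximum modified time seen so far, so the latest file is found no matter where it appears in the loop order.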
Super Informative 👌
Very informative 😊 Thanks for sharing
Thank you Madam 😊
@@WafaStudies could you please share your email ID? I have some doubts.
Thank You So Much Sir always helping.
Welcome 😊
Hi Maheer, I'm learning ADF by watching your videos - it's an amazing series. I just want to cross-check: I think we need to use the 2 Set Variable activities in the True condition only, setting the previous modified date first and then the latest file name variable. Only then does it work fine in my case. Thanks
Wonderful tips! Thanks, you help a lot.
Thank you ☺️
Thanks Maheer, I was looking for this video, cheers!
Welcome 🤗
@@WafaStudies Thank you for your videos. Can you please create a video on extracting a file from a SharePoint location and loading it into a SQL table? Thanks
Hi Maheer, can't we have both 'Set Variable' activities inside the 'If Condition' True branch?
It should be
Hi sir, thanks for such simple and very informative videos. Can you make one video on how we can resume a failed Copy activity from the point where it failed, not from the start? How can we achieve that?
Sure. I will plan
Thanks for this video! Can you also create a video to explain how we can verify source and target tables? Like how we can verify that all the row and column values got copied correctly using Data Factory.
Hello sir,
Can you please tell me what to do if we are getting two files with the same last modified date and time? What can be done in this case?
Super helpful. Thank you very much.
Welcome 😀
What if I have a date-level hierarchy in a Data Lake Gen2, where I have a folder structure for each table like /table1/2022/01/03.. /table1/2022/01/10 and the files are present there? How should I pick the latest file in this case?
@wafastudies thanks for your explanation. But this solution is not scalable, right? As the number of files increases, the for loop has to check all the files every day to get the latest file. Would you suggest any scalable solution?
Sir, can we sort the list of files that we got in the JSON on lastModified date in DESC order and pick the latest modified file?
Thanks for this video. Can you please share how to handle this: our input is Excel files arriving on a daily basis, and we want the latest file name with its last modified date. How do we implement the ADF pipeline?
We are processing files from an SFTP location, but the issue is that each time we upload a new file to the SFTP location and run the pipeline, it processes the already-processed files along with the new file. As the number of files keeps growing, this is becoming a problem. Instead, once a file is processed, we want to move it to an archive folder in the SFTP location so that only the latest file is processed in the next run. How do we do this?
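One pattern that is often used for this (a sketch, not something shown in the video; folder names are placeholders): after the Copy activity that processes each file, chain a second Copy activity that writes the same file to an archive folder on the SFTP server, then a Delete activity that removes the original.

ForEach (over files returned by Get Metadata)
    Copy activity: sftp input folder/<file> -> processing sink
    Copy activity: sftp input folder/<file> -> sftp archive folder/<file>
    Delete activity: sftp input folder/<file>

Some file-based Copy activity sources also expose a "Delete files after completion" option, which can replace the explicit Delete activity if your connector supports it.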
Nice and clear explanation. But when I try this on files in different subfolders, the mechanism doesn't work with wildcards for subdirectories. Do you have a solution for that?
Hi Maheer, thanks for your efforts on making these videos. They're really very helpful. I am looking for a similar scenario, but instead of a file I need to get the latest table records from SQL Server. Can you please explain how to get them? Thanks
Thanks for sharing 👍
Welcome 😁
What file is used for the dataset at the very beginning? Is it a static CSV containing all the file names? I cannot follow.
Thank you Maheer, the video is really very good. But this will not work in every scenario for getting the latest file from the folder. If there are 4 files in the folder, and the first file is the latest and the last file is the second latest, then it will pick the second latest file, because the Previous_Modified_Date variable is updated with Last_modified_date in every iteration of the for each loop.
Thanks for highlighting this issue. I noticed the same while implementing it - it won't work if your first file is the latest file. @birendra singh rawat: Can you please suggest where we need to correct it to implement the logic properly?
@@sourabhgupta1428 Try running the for each loop without Sequential; it should give the correct output.
@@sourabhgupta1428 store the previous date only if the condition is true, just like the file name. That way you will always have the latest date in the previous date variable; this should work.
Hi Sir, could you please help me out with this requirement: how to get the oldest file from a folder and process it in Azure Data Factory?
How to fetch files from more than one folder and trigger the respective pipeline or Databricks notebook?
Hi Everyone,
Quick question: when I upload 2 files (A & B) at the same time, it only copies file A and does not copy file B. Can you please help me with the logic for handling 2 files uploaded at the same time?
It is asking me to provide FileName in the 1st Get Metadata activity. What should I give there? Can someone help, please?
Is this like incremental loading
Hi Maheer, this is not working when 2 or more files are modified; it considers the date of the last file in the input folder rather than the last modified date, so some logic in this video looks wrong. Please try it with 7 to 8 files having different, random date values. Kindly have a look again and let me know if you want me to share more details.
This is my doubt too. Hi bro, please give a solution.
Instead of PreviousModifiedDate we have to store PreviousMaxModifiedDate, using a Set Variable inside the If condition along with the LatestFileName Set Variable.
Hi, did you get to know how to store multiple files using the last modified date?
Hi Wafa, how to extract a file from a folder partitioned by date?
Thanks for this video
I think the solution would be to use a PySpark or Python notebook? Am I right?
It won't work as expected if you have files like the ones below:
File1 8/10/2022 10:53:00:00
File2 8/10/2022 10:51:00:00
File3 8/10/2022 10:52:00:00
superb.........
Hi Maheer,
I think this is not a feasible solution, because as the number of files increases, the number of comparisons also increases. Suppose we have 50,000 files; then there will be 50,000 comparisons, which will hurt performance.
Yes... I think it's better to use a PySpark or Python notebook.
This logic is good, thanks for it, but it isn't working as shown; we need to set the previous modified date inside the If Condition activity, and only then will it work.
Thanks
Welcome 🤗
@@WafaStudies Eid mubarak Sir
@@papachoudhary5482 Thank you 😊 Wish you same 😊
Hi @WafaStudies, can't we get the modified date in Get Metadata1 itself?
No, because your first Get Metadata dataset points to the folder, and if you try to get lastModified there it will give you the folder's lastModified info.
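To make that concrete, a rough sketch of what the two Get Metadata activities return in this setup (names assumed from the video, sample values illustrative):

Get Metadata1 - dataset points to the folder, field list: Child items
    output.childItems = [ { "name": "file1.csv", "type": "File" }, { "name": "file2.csv", "type": "File" } ]

Get Metadata2 - dataset points to a single file (file name parameterized with @item().name inside the ForEach), field list: Last modified
    output.lastModified = "2023-02-18T10:53:00Z"

So the folder-level activity only lists names and types, and the per-file activity is what supplies the modified date used in the comparison.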
@WafaStudies what if we have files in multiple sub folders, from the root folder?
👍🏻👍🏻
😁👍
If it is sql
This is not the correct way. Check incremental data load using Data Factory in the Microsoft documentation.
Nice video, but you need to speak slower.
Thank you. Sure. Thanks for feedback. I will work on it.
This is wrong - I have tested it. There are 4 files with the dates below. In the first iteration it sets previous date = last modified date.
previous date: 01/01/199
last modified dates: 10/2/2023, 2/2/2023, 1/2/2023, 5/2/2023, 18/2/2023
In the first iteration 10/02/2023 is greater; after that it does not pass the later iterations because the condition is not satisfied, but the actual latest file is 18/02/2023.
She has used a Set Variable to assign lastModified to the previous-value variable inside the If condition: ua-cam.com/video/sYM6kVpng28/v-deo.html - which is the best approach.
Hi, how to pick the latest file if we have a list of files with a date suffix like File_YYYYMMDD.csv?
@greater(formatDateTime(activity('Get Metadata2').output.lastModified,'yyyyMMddHHmmss'),formatDateTime(variables('prevLastModifiedDate'),'yyyyMMddHHmmss'))
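A quick note on this expression (which appears to be the If Condition inside the ForEach): formatting both timestamps as yyyyMMddHHmmss makes greater() behave as a plain string comparison that still sorts in date order. An equivalent form, assuming prevLastModifiedDate holds a parseable timestamp, is to compare ticks(activity('Get Metadata2').output.lastModified) with ticks(variables('prevLastModifiedDate')). Either way, the Set Variable that updates prevLastModifiedDate needs to sit inside the True branch, as discussed in the earlier comments.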