Great presentation, informative session. Thank you, Anjan.
Thank you very much for giving a clear understanding of Google Cloud Functions.
Thank you so much for this, I have learned a lot from this video. Please post more.
Thank you very much for the video. It was exactly what I was looking for.
A query on the following case:
1) Every couple of minutes (i.e. every 15 or 30 minutes), multiple files (i.e. 20 to 30 files, sometimes more) are loaded into a folder.
2) Each file loaded into the folder triggers Cloud Run (Eventarc is set on that specific folder).
3) A Cloud Run instance is created and invoked for each file created in the folder.
4) Cloud Run reads that file, transforms the data as needed, and stores the result in one specific BigQuery table.
Here the questions are:
1) If multiple Cloud Run instances are created (concurrency is not 1), how is it ensured that two Cloud Run instances do not work on the same file, since both will get an event on each file creation (as per my understanding)?
2) Is this handled internally by the cloud platform, or does custom code need to be written to handle this case?
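Reply: Under normal delivery, each Cloud Storage finalize event is routed to a single request, so two instances should not both pick up the same file. However, Eventarc (via Pub/Sub) guarantees only at-least-once delivery, so the same event can occasionally be redelivered (e.g. after a timeout), and the platform does not deduplicate for you; the handler should be idempotent. A minimal sketch of an idempotency guard, assuming a hypothetical BigQuery tracking table `my_project.my_dataset.processed_files` with a `file_name` column (names are illustrative, not from the video):

```python
# Idempotency guard sketch for the Cloud Run handler (illustrative names).
from google.cloud import bigquery

bq = bigquery.Client()
META_TABLE = "my_project.my_dataset.processed_files"  # hypothetical table

def already_processed(file_name: str) -> bool:
    # Check whether a previous (or concurrent) delivery already handled this file.
    job = bq.query(
        f"SELECT 1 FROM `{META_TABLE}` WHERE file_name = @f LIMIT 1",
        job_config=bigquery.QueryJobConfig(
            query_parameters=[bigquery.ScalarQueryParameter("f", "STRING", file_name)]
        ),
    )
    return len(list(job.result())) > 0

def mark_processed(file_name: str) -> None:
    # Record the file so a redelivered event is skipped next time.
    bq.insert_rows_json(META_TABLE, [{"file_name": file_name}])
```

Process the file only when `already_processed` returns False, then call `mark_processed`. There is still a small race window between the check and the insert; making the BigQuery load itself idempotent (e.g. keyed by file name) closes it.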
Thank you for creating this practical use case. How does it know not to load the same file again? Where is that information captured in GCS to keep track of loaded files? And if I want to reload a file, what should I do?
Please make a complete series on GCP data engineering.
Hello Anjan, could you please make a video with a complete roadmap for becoming a GCP data engineer, keeping in mind non-IT folks and testers who want to get into GCP? Thank you.
Hi, sure. Please wait a bit longer and keep watching the other videos. I have been thinking the same for the past few weeks, but after the Cloud Functions series I have planned a Composer series. If I get some time, I will try to do that in between.
Hi, could you please provide the code for us to practise?
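Reply: Assuming the video's approach of tracking loaded files in a BigQuery metadata table (the data_loading_metadata table another comment mentions), the history lives in BigQuery rather than in GCS itself. To reload a file, you would delete its tracking row and then re-upload or re-trigger the file. A rough sketch, with the table path and `file_name` column assumed for illustration:

```python
# Sketch: clear the tracking row so the next event for this file is
# processed again. Table path and column name are assumptions.
from google.cloud import bigquery

bq = bigquery.Client()

def allow_reload(file_name: str) -> None:
    bq.query(
        "DELETE FROM `my_project.my_dataset.data_loading_metadata` "
        "WHERE file_name = @f",
        job_config=bigquery.QueryJobConfig(
            query_parameters=[bigquery.ScalarQueryParameter("f", "STRING", file_name)]
        ),
    ).result()  # wait for the DML job to finish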
Hi Anjan, thank you for the video. I need urgent help...
I have a use case where I need to append only the delta data of a JSON file from GCS to BigQuery. That is, if the file's data already exists in BigQuery and I load the same file with updated records or rows, how can I apply just the new entries without overwriting the existing old ones?
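Reply: One common pattern for this is to load each file into a staging table first and then run a BigQuery MERGE into the target, so matching rows are updated and new rows are inserted rather than the table being overwritten. A hedged sketch, where the table paths and the `id`/`payload` columns are assumptions, not from the video:

```python
# Upsert sketch: load into a staging table, then MERGE into the target.
from google.cloud import bigquery

bq = bigquery.Client()

merge_sql = """
MERGE `my_project.my_dataset.target_table` T
USING `my_project.my_dataset.staging_table` S
ON T.id = S.id
WHEN MATCHED THEN
  UPDATE SET payload = S.payload
WHEN NOT MATCHED THEN
  INSERT (id, payload) VALUES (S.id, S.payload)
"""
bq.query(merge_sql).result()  # runs the MERGE as a standard SQL job
```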
How would you do the above for 100 GB of data coming into the buckets?
What is this event, and where is it coming from?
You can only store up to 5 files; the others are no longer stored as tables but as records inside data_loading_metadata, or am I wrong?
Can you make a video on sending messages from Linux to Pub/Sub? I mean, how to send or receive messages.
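Reply: From any Linux machine the quickest test is the gcloud CLI, e.g. `gcloud pubsub topics publish my-topic --message=hello`. In Python, a minimal publish sketch with the google-cloud-pubsub client (project and topic names are placeholders):

```python
# Minimal Pub/Sub publish sketch (placeholder project/topic names).
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "my-topic")

# Messages are bytes; publish() returns a future resolving to the message ID.
future = publisher.publish(topic_path, b"hello from linux")
print(future.result())
```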
What if the file exists in some folder within the GCS bucket?
Could you please advise me or provide a reference on designing a streaming pipeline from Oracle to BigQuery?
Hi, could I do this for only a folder instead of the whole bucket?
Yes, we can.
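Since Cloud Storage triggers are scoped to a bucket, one common approach is to check the object-name prefix inside the function and ignore anything outside the folder. A small sketch, with the prefix "incoming/" assumed for illustration:

```python
# Folder-filter sketch for a GCS-triggered background function.
# The prefix "incoming/" is an assumption for illustration.
def handle_event(event, context):
    name = event.get("name", "")
    if not name.startswith("incoming/"):
        return  # ignore objects outside the target folder
    # ... process the file as shown in the video ...
```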
Please do share the code so that we can run it and learn.
Please check the video description; the code is there now.
Nice video
Hi, could you please share your LinkedIn ID here? Your content is superb compared to other tutorials.