Automating Data Loading from Google Cloud Storage to BigQuery Using Cloud Functions and Dataflow

  • Published 29 Nov 2024

COMMENTS • 45

  • @abhaybulbule6557 (11 months ago, +1)

    Proud to see one of us (software engineers) achieving their goals through dedication and commitment.
    Congratulations on everything you have achieved in your life!

  • @Raju__p-v4d (8 months ago)

    I have been looking for this content for a long time. Thank God I found it; very useful.

  • @alanguev (7 months ago, +1)

    You're the best, brother. The information was really helpful; I appreciate it a lot. Greetings from Buenos Aires, Argentina.

    • @techtrapture (7 months ago)

      Thank you brother ❤️🔥

  • @sampyedits3540 (1 month ago)

    Successfully completed this project, thanks!

  • @amritapattnaik3345 (9 months ago, +2)

    I loved all your videos. Keep posting 😇🙂🙃

  • @hunterajones (8 months ago, +1)

    Did the schema originally fail because the header values could not be parsed as integers? With the header removed, the original schema would work, right? Also, is there a way to automate header-row removal? I need to auto-load a CSV like this daily, but it will always have a header row that needs removing. Thanks for the video!!

    • @guptajipriyank (1 month ago)

      Same question about header removal... I need to add data daily.
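
One way to automate the header-row removal asked about above, sketched under assumptions (the "clean/" prefix and function name are illustrative, not from the video): a GCS-triggered Cloud Function that copies each uploaded CSV, minus its first line, to a prefix the Dataflow job reads from.

```python
from google.cloud import storage

def strip_header(event, context):
    """Triggered by each new object in the bucket; rewrites the file
    without its first (header) line under a clean/ prefix."""
    name = event["name"]
    if name.startswith("clean/"):  # don't re-process our own output
        return
    bucket = storage.Client().bucket(event["bucket"])
    text = bucket.blob(name).download_as_text()
    body = text.split("\n", 1)[1] if "\n" in text else ""
    bucket.blob(f"clean/{name}").upload_from_string(body, content_type="text/csv")
```

Pointing the Dataflow input pattern at gs://YOUR_BUCKET/clean/*.csv then keeps an integer schema working without manual edits.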

  • @faroozrimaaz7092 (10 months ago)

    Your videos are informative... keep going!

  • @arerahul (10 months ago)

    Insightful video. Just a question: can't we write the data-load job in Cloud Functions rather than using Dataflow? Also, how do we create a delete job, so that data is deleted whenever the file is deleted from GCS?

    • @techtrapture (10 months ago)

      Yes, we can write everything in Python and put it in a Cloud Function or Composer.
      Second question: you need to add something that identifies which data was loaded by your file, so your code can delete only that data (see the sketch below).
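
A hedged sketch of that delete job: stamp each row with its source file name at load time, then let an object-deletion trigger remove just those rows. The table and column names here are assumptions.

```python
from google.cloud import bigquery

TABLE = "my-project.my_dataset.sales"  # hypothetical destination table

def on_file_delete(event, context):
    """Triggered when an object is deleted from the bucket; deletes the
    rows that were loaded from that file (tracked in a source_file column)."""
    client = bigquery.Client()
    job = client.query(
        f"DELETE FROM `{TABLE}` WHERE source_file = @f",
        job_config=bigquery.QueryJobConfig(
            query_parameters=[
                bigquery.ScalarQueryParameter("f", "STRING", event["name"])
            ]
        ),
    )
    job.result()  # wait for the delete to finish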

  • @python_code08 (5 months ago, +1)

    Can we add this project to a resume as a mini-project?

  • @vinnakollurakesh8481 (10 months ago)

    Hi sir, can you help me pull data from the Kinaxis RapidResponse API to GCS? Any related documentation or videos would be helpful, thanks.

  • @noolusireesha205 (3 months ago, +2)

    Sir, I have done the same process as you mentioned in the video... I'm getting the error "java.lang.RuntimeException: Failed to serialize json to table row". Could you please reply with the solution?

    • @vignesh004 (2 months ago)

      Even I'm getting the same error.

  • @zzzmd11 (8 months ago)

    Hi, thanks for the great, informative video. Can you explain the flow if the data source is a REST API? Can Dataflow be configured to extract from a REST API into BigQuery without Cloud Functions or Apache Beam scripts involved? Thanks a lot in advance.

  • @nitinhbk (7 months ago)

    Could you please let me know what cost was shown in GCP for this activity?

  • @nitinhbk (7 months ago)

    Thank you. Really helpful session.

  • @GURUSINGH-d1c (9 months ago)

    Very good video.
    Where can I get more Cloud Function templates?

  • @sampyedits3540 (1 month ago)

    I got a prompt about CSV format while creating the Dataflow job and wrote "default", but now there's no data in my table.

  • @srikarfarmacy (5 months ago)

    Thank you for the video. I have one doubt: if my CSV file has a header, do I still need a JSON schema?

    • @techtrapture (5 months ago)

      Yes, the Dataflow job asks for a mandatory JSON schema file (see the example below).

    • @srikarfarmacy (5 months ago)

      @techtrapture Thank you for your prompt response. Could you provide a solution for this issue? Every day, data that contains a header row is automatically uploaded to my bucket, organized by date.
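
For reference, the classic "Text Files on Cloud Storage to BigQuery" Dataflow template expects the schema file to wrap the column list in a "BigQuery Schema" key, along these lines (the column names here are made up; the daily header-row problem itself is handled by the stripping sketch earlier):

```json
{
  "BigQuery Schema": [
    {"name": "id",     "type": "INTEGER"},
    {"name": "name",   "type": "STRING"},
    {"name": "amount", "type": "FLOAT"}
  ]
}
```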

  • @mulshiwaters5312 (5 months ago)

    This is exactly what I need; however, instead of a trigger I would like to use a scheduler with a certain time interval, like daily or weekly. How can I achieve this? Cloud Composer? Workflows? Cloud Scheduler?

    • @techtrapture (5 months ago, +1)

      In Cloud Scheduler you can use a cron expression to specify the date and time at which the job should trigger (see the example below).

    • @mulshiwaters5312 (5 months ago)

      @techtrapture Thanks, appreciate your help on this!
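
A minimal sketch of that cron approach with the google-cloud-scheduler client, assuming a project, region, and function URL that are placeholders (the same job can be created in the console or with gcloud):

```python
from google.cloud import scheduler_v1

# Hypothetical project/region/function URL; adjust to your deployment.
client = scheduler_v1.CloudSchedulerClient()
parent = client.common_location_path("my-project", "us-central1")

job = scheduler_v1.Job(
    name=f"{parent}/jobs/daily-gcs-load",
    schedule="0 6 * * *",  # every day at 06:00 (min hour dom mon dow)
    time_zone="Etc/UTC",
    http_target=scheduler_v1.HttpTarget(
        uri="https://us-central1-my-project.cloudfunctions.net/trigger_dataflow",
        http_method=scheduler_v1.HttpMethod.POST,
    ),
)
client.create_job(parent=parent, job=job)
```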

  • @subhashs5275 (6 months ago)

    Which location is used for the template path in the Python file?

  • @ayush10_08 (11 months ago)

    Hello sir,
    I have watched a lot of your videos on Cloud Functions and Dataflow.
    I have one question:
    as a GCP data engineer, who is responsible for writing the code for Dataflow or for Data Fusion?

    • @techtrapture (11 months ago)

      Data Fusion is a code-free ETL tool, but in general a data engineer is responsible for writing all the code for the data pipeline.

    • @ayush10_08 (11 months ago)

      @techtrapture So knowing only data-related services is not enough; we have to learn coding?

    • @techtrapture (11 months ago)

      @ayush10_08 Yes, for a data engineer role we need coding.

  • @earthlydope (6 months ago)

    There's a catch here: we need to create the BigQuery table schema and the UDF.js file every time before uploading a new flat file into the system.

  • @pramodasarath6733 (6 months ago)

    When going from Storage to BigQuery, do we have to select a CSV file or a text file?

  • @Makkar-b3v (4 months ago)

    You could do away with Dataflow here. A simple Python job using load_table_from_uri with auto schema detection enabled, run from the trigger function, would do this work.

    • @techtrapture (4 months ago)

      Yes, a single Python function would definitely work. This project is to learn different services in GCP.
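
A minimal sketch of that Dataflow-free alternative, assuming a GCS-triggered function and a hypothetical destination table; skip_leading_rows also sidesteps the header-row issue discussed above.

```python
from google.cloud import bigquery

TABLE = "my-project.my_dataset.sales"  # hypothetical destination table

def load_csv(event, context):
    """Triggered by each new CSV in the bucket; loads it straight into
    BigQuery with schema autodetection, no Dataflow involved."""
    uri = f"gs://{event['bucket']}/{event['name']}"
    job_config = bigquery.LoadJobConfig(
        source_format=bigquery.SourceFormat.CSV,
        autodetect=True,       # infer column names/types from the file
        skip_leading_rows=1,   # ignore the header row
        write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    )
    bigquery.Client().load_table_from_uri(uri, TABLE, job_config=job_config).result()
```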

  • @SnehaNitishGCPAC (11 months ago)

    I am not able to find the source code on GitHub. Would you please share a direct link?

    • @techtrapture (11 months ago)

      Here is the source code:
      github.com/vishal-bulbule/automate-gcs-to-bq

  • @swarnavo9 (1 month ago)

    Where is the code, buddy? I could not get it from your GitHub :(

    • @techtrapture (1 month ago)

      github.com/vishal-bulbule/automate-gcs-to-bq

  • @joshhicks2444 (4 months ago)

    Subscribing!