Automating Data Loading from Google Cloud Storage to BigQuery using Cloud Function and Dataflow

  • Published 25 Dec 2023
  • Looking to get in touch?
    Drop me a line at vishal.bulbule@gmail.com, or schedule a meeting using the provided link: topmate.io/vishal_bulbule
    Automating Data Loading from Google Cloud Storage to BigQuery
    In this video, I showcase how to automate the transfer of data from Google Cloud Storage to BigQuery. The automation uses a Cloud Function that launches a Dataflow job whenever a file lands in Cloud Storage, and I walk step by step through how this setup streamlines and orchestrates the data loading between these Google Cloud services. A minimal sketch of the trigger function appears after the links below.
    Associate Cloud Engineer - Complete Free Course
    • Associate Cloud Engine...
    Google Cloud Data Engineer Certification Course
    • Google Cloud Data Engi...
    Google Cloud Platform(GCP) Tutorials
    • Google Cloud Platform(...
    Generative AI
    • Generative AI
    Getting Started with Duet AI
    • Getting started with D...
    Google Cloud Projects
    • Google Cloud Projects
    Python For GCP
    • Python for GCP
    Terraform Tutorials
    • Terraform Associate C...
    LinkedIn
    / vishal-bulbule
    Medium Blog
    / vishalbulbule
    GitHub
    Source Code
    github.com/vishal-bulbule
    Email - vishal.bulbule@techtrapture.com
    #gcs #googlecloud #cloudstorage #bigquery #datapipeline #automation #cloudfunction
  • Science & Technology
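
A minimal sketch of the trigger function described above (assuming a 1st-gen Cloud Function on a google.storage.object.finalize event that launches the Google-provided "Cloud Storage Text to BigQuery" Dataflow template; project, region, table, and schema/UDF names are placeholders - see the GitHub repo above for the actual source):

```python
# main.py - sketch of a GCS-triggered Cloud Function (1st gen) that launches
# the Google-provided "GCS Text to BigQuery" Dataflow template.
# Requires google-api-python-client in requirements.txt.
from googleapiclient.discovery import build


def trigger_dataflow(event, context):
    """Triggered when a new object lands in the source bucket."""
    project = "my-project-id"          # placeholder project ID
    region = "us-central1"             # placeholder region
    bucket = event["bucket"]           # bucket that received the file
    file_name = event["name"]          # name of the uploaded file

    dataflow = build("dataflow", "v1b3", cache_discovery=False)
    template_path = "gs://dataflow-templates/latest/GCS_Text_to_BigQuery"

    body = {
        "jobName": f"gcs-to-bq-{file_name.replace('.', '-').lower()}",
        "parameters": {
            # JSON file describing the BigQuery schema (placeholder path)
            "JSONPath": f"gs://{bucket}/bq_schema.json",
            # JavaScript UDF mapping each CSV line to a JSON record (placeholder path/name)
            "javascriptTextTransformGcsPath": f"gs://{bucket}/udf.js",
            "javascriptTextTransformFunctionName": "transform",
            "inputFilePattern": f"gs://{bucket}/{file_name}",
            "outputTable": f"{project}:my_dataset.my_table",   # placeholder table
            "bigQueryLoadingTemporaryDirectory": f"gs://{bucket}/tmp",
        },
    }

    response = dataflow.projects().locations().templates().launch(
        projectId=project, location=region, gcsPath=template_path, body=body
    ).execute()
    print(f"Launched Dataflow job: {response['job']['name']}")
```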

COMMENTS • 28

  • @abhaybulbule6557
    @abhaybulbule6557 5 months ago +1

    Proud to see one of us (a software engineer) achieving his goals through dedication and commitment.
    Congratulations on everything you have achieved in your life.

  • @Ranjana_DE
    @Ranjana_DE 2 months ago

    I have been looking for this content for a long time. Thank God I found this; very useful.

  • @alanguev
    @alanguev 2 months ago +1

    You're the best, brother. The information was really helpful, I appreciate it a lot. Greetings from Argentina, Buenos Aires.

  • @faroozrimaaz7092
    @faroozrimaaz7092 5 months ago

    Your videos are informative. Keep going!

  • @nitinhbk
    @nitinhbk 1 month ago

    Thank you. Really helpful session.

  • @amritapattnaik3345
    @amritapattnaik3345 4 months ago +2

    I loved all your videos. Keep posting 😇🙂🙃

  • @hunterajones
    @hunterajones 3 months ago

    Did the schema originally fail because the header values are not integers? With the header removed, the original schema would work, right? Also, is there a way to automate header-row removal? I need to auto-load a CSV like this daily, but it will always have a header row that needs removing. Thanks for the video!!
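
One option for the header question (a sketch, not from the video) is to skip the Dataflow template for simple CSVs and let a BigQuery load job drop the header via skip_leading_rows; project, bucket, dataset, and table names below are placeholders.

```python
# Sketch: load a CSV from GCS into BigQuery while skipping the header row.
# Assumes the google-cloud-bigquery client library; all names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project-id")  # placeholder project

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,      # drop the header row automatically
    autodetect=True,          # or pass an explicit schema instead
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
)

load_job = client.load_table_from_uri(
    "gs://my-bucket/daily_export.csv",       # placeholder source file
    "my-project-id.my_dataset.my_table",     # placeholder destination table
    job_config=job_config,
)
load_job.result()  # wait for the load to finish
```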

  • @earthlydope
    @earthlydope 20 days ago

    There's a catch here: we need to create the BigQuery table schema and the UDF.js file every time before uploading a new flat file into the system.

  • @nitinhbk
    @nitinhbk 1 month ago

    Could you please let me know what cost was shown in GCP for this activity?

  • @vinnakollurakesh8481
    @vinnakollurakesh8481 4 months ago

    Hi sir, can you help me pull data from the Kinaxis RapidResponse API into GCS? Any related documentation or videos would be helpful, thanks.

  • @zzzmd11
    @zzzmd11 2 months ago

    Hi, thanks for the great, informative video. Can you explain the flow if the data source is a REST API? Can we configure Dataflow to extract from a REST API into BigQuery without Cloud Functions or Apache Beam scripts being involved? Thanks a lot in advance.

  • @user-zj3yx8rk3u
    @user-zj3yx8rk3u 4 months ago

    Very good video.
    Where can I get more Cloud Function templates?

  • @subhashs5275
    @subhashs5275 24 days ago

    Which location was the template path in the Python file?

  • @arerahul
    @arerahul 4 months ago

    Insightful video. Just a question - can't we write the data load job in Cloud Functions rather than using Dataflow? Also, how do we create a delete job, so that data is deleted whenever the file is deleted from GCS?

    • @techtrapture
      @techtrapture  4 months ago

      Yes, we can write everything in Python and put it in a Cloud Function or Composer.
      Second question - here you need to add something that identifies which data was loaded from your file, so your code can delete only that data.
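
A sketch of that second point (assumed details, not from the video): if each load stamps rows with a source_file column, a Cloud Function triggered by object deletion (google.storage.object.delete) can remove exactly those rows. Table and column names are placeholders.

```python
# Sketch: delete the rows that were loaded from a file when that file
# is removed from GCS. Assumes the target table has a `source_file`
# column populated at load time; all names are placeholders.
from google.cloud import bigquery

TABLE = "my-project-id.my_dataset.my_table"   # placeholder table


def delete_rows_for_file(event, context):
    """Triggered by google.storage.object.delete on the source bucket."""
    client = bigquery.Client()
    query = f"DELETE FROM `{TABLE}` WHERE source_file = @source_file"
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("source_file", "STRING", event["name"])
        ]
    )
    client.query(query, job_config=job_config).result()
    print(f"Deleted rows loaded from gs://{event['bucket']}/{event['name']}")
```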

  • @mulshiwaters5312
    @mulshiwaters5312 8 days ago

    This is exactly what I need, except that instead of a trigger I would like to use a scheduler with a certain time interval, like daily or weekly. How can I achieve this? Cloud Composer? Workflows? Cloud Scheduler?

    • @techtrapture
      @techtrapture  8 days ago

      In Cloud Scheduler you can use a cron expression to specify the date and time at which the job should be triggered.
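
As a rough sketch of that suggestion (assumed, not from the video), a Cloud Scheduler job with a cron expression can call an HTTP-triggered Cloud Function on a daily schedule; project, region, and function URL below are placeholders.

```python
# Sketch: create a Cloud Scheduler job that calls an HTTP-triggered
# Cloud Function every day at 06:00 UTC. All names and URLs are placeholders.
from google.cloud import scheduler_v1

client = scheduler_v1.CloudSchedulerClient()
parent = client.common_location_path("my-project-id", "us-central1")

job = scheduler_v1.Job(
    name=f"{parent}/jobs/daily-gcs-to-bq-load",
    schedule="0 6 * * *",        # cron expression: every day at 06:00
    time_zone="Etc/UTC",
    http_target=scheduler_v1.HttpTarget(
        uri="https://us-central1-my-project-id.cloudfunctions.net/load-gcs-to-bq",
        http_method=scheduler_v1.HttpMethod.POST,
    ),
)
client.create_job(request={"parent": parent, "job": job})
```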

  • @pramodasarath6733
    @pramodasarath6733 22 days ago

    Do we have to select a CSV file from Storage to load into BigQuery, or a text file?

  • @ayushnaphade9419
    @ayushnaphade9419 5 months ago

    Hello sir,
    I have watched a lot of your videos related to Cloud Functions and Dataflow.
    I have one question:
    as a GCP data engineer, who is responsible for writing the code for Dataflow or for Data Fusion?

    • @techtrapture
      @techtrapture  5 months ago

      Data Fusion is a code-free ETL tool, but in general a data engineer is responsible for writing all the code for the data pipeline.

    • @ayushnaphade9419
      @ayushnaphade9419 5 months ago

      @@techtrapture So that means having knowledge of only the data-related services is not enough and we have to learn coding?

    • @techtrapture
      @techtrapture  5 months ago

      @@ayushnaphade9419 Yes, for a data engineer role we need coding.

  • @user-dq3qw5sl1v
    @user-dq3qw5sl1v 5 months ago

    I am not able to find the source code on GitHub. Could you please share the direct link?

    • @techtrapture
      @techtrapture  5 months ago

      Here is the source code:
      github.com/vishal-bulbule/automate-gcs-to-bq