Create an external data stage (Azure) on Snowflake

  • Published 16 Jan 2025

COMMENTS • 25

  • @KahanDataSolutions
    @KahanDataSolutions  4 years ago +1

    Want to build a reliable, modern data architecture without the mess?
    Here’s a free checklist to help you → bit.ly/kds-checklist

  • @satyagouda9494
    @satyagouda9494 8 months ago +1

    Thank you so much! Very simple example and easy to understand!

  • @pilarbaldominos2505
    @pilarbaldominos2505 2 years ago +1

    Thanks for sharing … you don't know how much you have helped me start my journey with Snowflake!

    • @KahanDataSolutions
      @KahanDataSolutions  2 years ago

      That's great to hear. I'm glad it has been helpful for you - thanks for watching!

  • @mkillough67
    @mkillough67 1 year ago +1

    Great video, thank you, but how do you automate adding your input file (a CSV, in this case) to Azure without having to manually upload it?

  • @masoudghodrati3770
    @masoudghodrati3770 2 years ago +1

    Great vid, thanks.
    Can't the copy/insert be done automatically? Like, as soon as a new file is added, we see the data appear in the table?
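
The creator points to Snowpipe for this further down the thread. As a hedged sketch, assuming an external Azure stage, a target table, and a notification integration wired to Azure Event Grid (all placeholder names), an auto-ingest pipe could look like:

```sql
-- Hypothetical names throughout; the Event Grid notification setup is
-- assumed to exist already.
CREATE OR REPLACE PIPE my_pipe
  AUTO_INGEST = TRUE
  INTEGRATION = 'MY_NOTIFICATION_INTEGRATION'
AS
COPY INTO my_table
FROM @my_azure_stage
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```

Once the pipe exists, each new file landing in the container triggers a load automatically, with no manual COPY needed.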

  • @KahanDataSolutions
    @KahanDataSolutions  4 years ago +3

    **NOTE**:
    In this video I indicate there are storage costs associated with holding on to the staged data. However, that would actually not apply for "External" stages (such as Azure Blob, S3, etc.). Rather, this type of stage acts as a pointer to that cloud location. You would, however, incur storage costs if it is an "Internal" stage, where Snowflake would actually be holding the data.
    Also, purging/removing from an "External" stage WILL DELETE the file in the cloud storage as well. This example was meant to show how that syntax works, but be cautious before running that step if it is critical data!
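
To make the warning concrete, here is a minimal sketch of the two variants, with placeholder table, stage, and file-format names:

```sql
-- Destructive variant: loads the staged files AND deletes them from the
-- external location (e.g. the Azure Blob container) after a successful load.
COPY INTO my_table
FROM @my_azure_stage
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format')
PURGE = TRUE;

-- Safer default: leave PURGE off. Snowflake tracks which files it has
-- already loaded, so re-running the COPY only picks up new files.
COPY INTO my_table
FROM @my_azure_stage
FILE_FORMAT = (FORMAT_NAME = 'my_csv_format');
```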

  • @ihafidh
    @ihafidh 4 years ago +3

    Great video again. Question though - are you sure that the stage will add to Snowflake costs? You created an “external” stage, which was a pointer to an Azure blob. That should not add to the storage cost in Snowflake.
    Had you created an “internal” stage (which is stored in Snowflake), then it would add to storage cost.

    • @KahanDataSolutions
      @KahanDataSolutions  4 years ago +2

      Thank you for pointing this out! You are correct. For the external stage it would only be storing the metadata of the pointer, not the data itself. I will add a note to the description calling that out. Appreciate you watching!
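
For reference, the distinction being discussed comes down to where the files live. A minimal sketch, with placeholder names and a placeholder SAS token:

```sql
-- External stage: only a pointer to an Azure Blob container.
-- Snowflake stores just the metadata, so no Snowflake storage cost.
CREATE OR REPLACE STAGE my_azure_stage
  URL = 'azure://myaccount.blob.core.windows.net/mycontainer'
  CREDENTIALS = (AZURE_SAS_TOKEN = '<sas-token>');

-- Internal stage: files are uploaded into Snowflake-managed storage
-- and do count toward Snowflake storage costs.
CREATE OR REPLACE STAGE my_internal_stage;
```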

  • @RichardMartinez-o2e
    @RichardMartinez-o2e 3 months ago

    And this is all done with wide-open network access to your blob storage?

  • @SuperICHmagdich
    @SuperICHmagdich 2 years ago

    Hi Kahan, great video as always. Am I correct that copying into a table means that, under the hood, the data is transferred from the external stage to the internal table stage?

    • @KahanDataSolutions
      @KahanDataSolutions  2 years ago +1

      I'm not 100% positive, but my understanding is that an external stage is not transferred to an internal stage, but rather copied directly into your table from the endpoint location (e.g. an S3 bucket).

    • @SuperICHmagdich
      @SuperICHmagdich 2 years ago

      The COPY command is then kind of unfortunately named, as a table is not a real storage location. Consequently, deleting data from e.g. S3 should result in this being reflected in the Snowflake table, am I right? Maybe some clarification regarding this is worth raising a Snowflake ticket for.

  • @user-ug64e3st87op
    @user-ug64e3st87op 10 months ago

    Hi! How do you copy incrementally? For example, I have 1,000,000 rows in a table in Snowflake, then one more row is added in the Azure DB, so now the Azure DB has 1,000,001! How do I add just that one last row to Snowflake without copying aaall the data?

  • @satyaraj1427
    @satyaraj1427 3 years ago

    Hey, the video is great... but I have a doubt. Is there a way to load the CSV files directly? Like, without creating tables in the worksheet and loading them, is there a way to do a direct file load from blob storage? Please help me with this... thank you!

  • @noel_g
    @noel_g 4 years ago

    Does purging remove it from the blob storage container?

    • @KahanDataSolutions
      @KahanDataSolutions  4 years ago +2

      In this scenario, yes, PURGE = TRUE would also remove it from the external stage (blob). If you do not want that to happen you can leave that part out, and Snowflake will still be smart enough to only work on the newly added files.
      Note: I accidentally misspoke in this video by saying there are storage costs associated with "External" stages (Blob, S3, etc.), which there are not. Costs would only apply to "Internal" stages. External stages are simply a pointer to that Blob container and do not hold the data itself. Therefore, it is less critical to be constantly purging that type of stage.

  • @nishavkrishnan4271
    @nishavkrishnan4271 3 years ago

    I am getting an error when I list the stage. The error is: Failure using stage area. Cause: [Server failed to authenticate the request. Make sure the value of Authorization header is formed correctly including the signature. (Status Code: 403; Error Code: AuthenticationFailed)]
    Can you please help?

  • @hyuntaekjoo6034
    @hyuntaekjoo6034 4 years ago

    Good tutorial! Thank you!
    Can I ask a question? Is it possible to copy tables, not just data?
    Because compared to just importing a CSV file itself, it looks better - it saves one step.

    • @KahanDataSolutions
      @KahanDataSolutions  4 years ago

      Thank you for watching, and yes it is! It sounds like you may be looking to use the "clone table" feature. This video will help explain / walk you through it - ua-cam.com/video/aVZxk8tz5p8/v-deo.html
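
The "clone table" feature mentioned here is a one-liner; a minimal sketch with placeholder names:

```sql
-- Zero-copy clone: the new table shares the original's underlying
-- storage until either table is modified.
CREATE TABLE my_table_copy CLONE my_table;
```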

  • @OfficialSnowsix
    @OfficialSnowsix 3 years ago

    I noticed on the create stage portion there was a Snowflake-managed option. What if we use Snowflake but not Azure or AWS? Would we then just use Snowflake-managed? Would it just be a repository via Snowflake instead of S3 or blob storage?

    • @KahanDataSolutions
      @KahanDataSolutions  3 years ago

      That's correct. This option refers to Internal Snowflake stages which would indicate that the data is not originally stored in an external location such as S3 or BLOB, but rather uploaded directly to Snowflake. Here are some links with some more information:
      - docs.snowflake.com/en/sql-reference/sql/create-stage.html
      - docs.snowflake.com/en/user-guide/data-load-local-file-system-create-stage.html
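
A minimal sketch of that internal-stage flow, with placeholder names; the upload runs from a client such as SnowSQL rather than the web worksheet:

```sql
-- Internal (Snowflake-managed) stage: no external cloud account involved.
CREATE OR REPLACE STAGE my_internal_stage;

-- Upload a local file into the stage (run from SnowSQL or another
-- client that supports PUT).
PUT file:///tmp/orders.csv @my_internal_stage;

-- Load the staged file into a table.
COPY INTO my_orders
FROM @my_internal_stage
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);
```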

  • @noel_g
    @noel_g 4 years ago

    How would this differ if the file was Parquet? E.g. is there a way to create the table automatically since the column info is already in the file metadata?

    • @KahanDataSolutions
      @KahanDataSolutions  4 years ago

      Hi Noel - The process would be very similar, except you would need to create a File Format for "Parquet" instead of "CSV", and the copy script would require some parsing (see the sketch at the end of this thread). I will have a video next week on working w/ JSON which will cover similar functionality.
      For now, here is a link to the Snowflake docs that explain it in more detail. - docs.snowflake.com/en/user-guide/script-data-load-transform-parquet.html
      If your goal is to automate this process, I also suggest looking into auto-ingesting data w/ Snowpipe.
      Hopefully between this video and the documentation it will get you going in the right direction!

    • @noel_g
      @noel_g 4 years ago +1

      @KahanDataSolutions Thanks. I guess we can't get away from defining the table first. I thought that since the schema info and data types are already there, a step could be skipped.
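
A hedged sketch of the Parquet variant described in this thread; the stage, format, and column names are placeholders, assuming a file with id and amount fields:

```sql
-- Parquet file format instead of CSV.
CREATE OR REPLACE FILE FORMAT my_parquet_format TYPE = 'PARQUET';

-- Parquet rows land in a single VARIANT column ($1), so the COPY parses
-- out each field by name and casts it to the target type.
COPY INTO my_table (id, amount)
FROM (
  SELECT $1:id::INTEGER,
         $1:amount::NUMBER(10,2)
  FROM @my_azure_stage
)
FILE_FORMAT = (FORMAT_NAME = 'my_parquet_format');
```

On the follow-up question: Snowflake has since added the INFER_SCHEMA table function and CREATE TABLE ... USING TEMPLATE, which can derive a table definition from Parquet metadata, so the manual table-definition step can now sometimes be skipped.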