Hello Anupam, great insight for us aspiring data engineers.
Just one request: could you please create a video on auto-ingestion from Google Cloud Storage into Snowflake, just as you have done for AWS?
Thank you for this video. I appreciate your effort in sharing knowledge. Could you please make some videos on Acryl DataHub / DataHub: metadata, glossary, domains, and the data lineage you get by ingesting data into it?
Noted
Thanks a lot, very helpful
Excellent resource !!!
Glad it was helpful!
Very informative video. Thank you.
Glad it was helpful!
Thank you for the video. It is short but has enough information. I have two questions: How does Snowflake detect duplicate files? Is there any query to identify the duplicate files processed in Snowflake?
It keeps track of the file names in internal metadata tables. Check this page if it helps - community.snowflake.com/s/article/COPY-INTO-completes-with-Copy-executed-with-0-files-processed
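A minimal sketch of such a query, assuming a target table named MYTABLE (a placeholder): the COPY_HISTORY table function lists the files already loaded into a table, and files that appear there with a LOADED status are the ones Snowpipe will skip as duplicates.

-- files loaded into MYTABLE over the last 7 days (MYTABLE is a placeholder)
select file_name, last_load_time, status, row_count
from table(information_schema.copy_history(
    table_name => 'MYTABLE',
    start_time => dateadd('day', -7, current_timestamp())
));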
Hello Anupam, do we have to create the SQS queue in AWS first, or does creating the Snowpipe directly create the queue?
The SQS notification channel is created automatically when you create the Snowpipe.
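A minimal sketch, with placeholder stage and table names: after creating the pipe, SHOW PIPES exposes the notification_channel column, which holds the ARN of the SQS queue Snowflake created; that ARN is what you paste into the S3 bucket's event notification.

-- stage and table names are illustrative and assumed to exist already
create or replace pipe mypipe auto_ingest = true as
  copy into mytable from @mystage file_format = (type = csv);

-- notification_channel holds the auto-created SQS queue ARN for the S3 event setup
show pipes like 'mypipe';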
Thanks for the video. I have a small doubt: let's say we are using an internal stage and there are a couple of different files, i.e., employee, salary, and dept, all of type CSV. Now, if I have to load them into their respective tables in Snowflake (employee, salary, dept) using a pipe, is that possible? If yes, what will the notification channel be?
It is possible to load data from a stage containing different file names by using a pattern. Please refer to this document for more info - docs.snowflake.com/en/sql-reference/sql/copy-into-table#syntax
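A minimal sketch of the PATTERN option from that page, with assumed stage and table names; the regular expression restricts the COPY to files whose names contain "employee":

-- load only the employee files from a shared stage (names are placeholders)
copy into employee
from @my_stage
pattern = '.*employee.*[.]csv'
file_format = (type = csv);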
Thank you for the video. I have a doubt: suppose the source file is deleted, I mean the file in S3. Are we still able to get the rows when querying the data with select * from the table?
If the data has been loaded into the Snowflake table and the S3 file is then deleted, we will still be able to see the data, since it was loaded into the table and can be used for further analysis. It is not like an external table, where you only define the table structure in Snowflake and the actual data resides in S3.
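A short sketch of the contrast, with illustrative names: a COPY physically loads rows into Snowflake storage, while an external table reads from S3 at query time, so deleting the file empties the external table but not the regular one.

-- regular table: rows are copied into Snowflake and survive deletion of the S3 file
copy into mytable from @mystage file_format = (type = csv);

-- external table: only metadata lives in Snowflake; queries read S3 at runtime
create or replace external table ext_mytable (
  c1 varchar as (value:c1::varchar)
)
location = @mystage
file_format = (type = csv)
auto_refresh = true;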
For that SNS: do we need to create a new SNS topic in our AWS account for this to work, or is the SNS already provided by AWS? Also, do we need to change the trust policy of the S3 bucket or add any other policy?
nice
Hi, where can I find the previous video? You mention it as the last session.
Please check videos on this channel
Thanks for the video, that's very informative :). However, I have a doubt. Let's say my CSV file has more columns than the table, and AUTO_INGEST = TRUE. In this case, will it load the data into the table, or will it not process the file? If it doesn't process the file, is there any log that keeps track of it?
Hello Princy,
Please follow the document below for more details:
docs.snowflake.com/en/sql-reference/functions/pipe_usage_history
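A minimal sketch of both functions, assuming the pipe is named mypipe and the table mytable: PIPE_USAGE_HISTORY shows how many files and bytes the pipe processed, while COPY_HISTORY exposes the per-file status and first error message, which is where a column-count mismatch would show up.

-- files and credits processed by the pipe over the last day (pipe name assumed)
select *
from table(information_schema.pipe_usage_history(
    date_range_start => dateadd('day', -1, current_timestamp()),
    pipe_name => 'mypipe'
));

-- per-file status and first error, e.g. a column-count mismatch
select file_name, status, first_error_message
from table(information_schema.copy_history(
    table_name => 'mytable',
    start_time => dateadd('day', -1, current_timestamp())
));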
Thanks sir
Can I create a Snowpipe on an internal named stage?
No, auto-ingest Snowpipe works only with external stages; for internal named stages you would need to trigger loads through the Snowpipe REST API instead.
For ingesting multiple different tables into multiple different stages: is it possible to load multiple files with different metadata using one single pipe and a single stage?
It's better to keep different types of files in different stages and load them with separate Snowpipes.
With one pipe it's not possible.
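A minimal sketch of that layout, with placeholder names throughout: one stage and one pipe per file type, so each COPY statement targets exactly one table.

-- one stage and one pipe per file type (all names are placeholders)
create or replace stage employee_stage
  url = 's3://my-bucket/employee/' storage_integration = my_int;
create or replace stage salary_stage
  url = 's3://my-bucket/salary/' storage_integration = my_int;

create or replace pipe employee_pipe auto_ingest = true as
  copy into employee from @employee_stage file_format = (type = csv);

create or replace pipe salary_pipe auto_ingest = true as
  copy into salary from @salary_stage file_format = (type = csv);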
@@anupam.kushwah ok sir thank you
Hi bro, I need to configure the CREATE PIPE connector, especially the COPY statement; please help me.
Please use this link to create a Snowpipe: anupamkushwah.medium.com/snowflake-snowpipe-automated-data-loading-from-s3-bucket-b395f8d508da
If you have any further questions, please mention them here.
@@anupam.kushwah Thank you for your response. We don't want to use S3 or any other cloud; we just need a direct integration between Mule and Snowflake, and that too using CREATE PIPE, due to cost concerns.
Snowpipe only supports loading data from public cloud stages. Please refer to this documentation - docs.snowflake.com/en/user-guide/data-load-snowpipe-intro
Sir, how do I load data from internal storage, i.e., a CSV file on my laptop?
You can load CSV files from your laptop using the SnowSQL tool and the PUT command. Follow this link for step-by-step instructions:
docs.snowflake.com/en/user-guide/data-load-internal-tutorial
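A minimal sketch of that flow, with placeholder paths and names: PUT uploads the local file from SnowSQL into a named internal stage, and a COPY INTO then loads it into the table.

-- run from SnowSQL; file path, stage, and table names are placeholders
create or replace stage my_internal_stage;
put file:///Users/me/data/employees.csv @my_internal_stage auto_compress = true;

-- PUT gzips the file by default, hence the .gz suffix
copy into employees
from @my_internal_stage/employees.csv.gz
file_format = (type = csv skip_header = 1);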
I have done everything, but the data isn't showing in the table.
Could you please elaborate on the problem and share the scripts here?
@@anupam.kushwah In the beginning, when I was using the COPY command to load data from the stage into my table, it showed errors on integer-type columns. But when I took another dataset, it worked with the COPY command. Now it is not working with the pipe.
@@anupam.kushwah CREATE OR REPLACE TABLE HEALTHCARE(
Patientid VARCHAR(15),
gender CHAR(8),
age VARCHAR(5) ,
hypertension CHAR(20),
heart_disease CHAR(20),
ever_married CHAR(30),
work_type VARCHAR(60),
Residence_type CHAR(30) ,
avg_glucose_level VARCHAR(20),
bmi VARCHAR(20) ,
smoking_status VARCHAR(20),
stroke CHAR(20)
);
CREATE FILE FORMAT CSV
TYPE = CSV;
--create storage integration
CREATE OR REPLACE STORAGE INTEGRATION snowpipe_integration
TYPE = external_stage
STORAGE_PROVIDER = s3
STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::922199053730:role/my-snowpipe-role'
ENABLED = true
STORAGE_ALLOWED_LOCATIONS = ('*');
desc integration snowpipe_integration;
--create stage
create or replace stage patient_snow_stage
url = 's3://my-snowflake-bucket-akas'
file_format = CSV
storage_integration = snowpipe_integration;
LIST @patient_snow_stage;
show stages;
--pull data from stage directly
select $1, $2 from @patient_snow_stage/test1.csv;
create or replace pipe patients_snowpipe auto_ingest = TRUE AS
copy into DEMO_DATABASE.PUBLIC.patients3
from @patient_snow_stage
ON_ERROR = 'skip_file';
show pipes;
alter pipe patients_snowpipe refresh;
create or replace table patients3 like HEALTHCARE;
select * from patients3;
@@aakashyadav4062 Did you set up the SQS notification in AWS for the Snowflake pipe?
@@anupam.kushwah Yes, I have created the event notification and pasted the SQS ARN from the pipe's notification channel into it.
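A minimal set of checks that usually narrows this down, using the pipe and table names from the script above: SYSTEM$PIPE_STATUS reports whether the pipe is running and has pending notifications, and COPY_HISTORY shows whether files were loaded or skipped; note the script uses ON_ERROR = 'skip_file', so a file with bad rows is skipped silently.

-- is the pipe running, and are S3 notifications arriving?
select system$pipe_status('patients_snowpipe');

-- were files loaded or skipped? first_error_message explains any skips
select file_name, status, first_error_message
from table(information_schema.copy_history(
    table_name => 'patients3',
    start_time => dateadd('day', -1, current_timestamp())
));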