Once you get them into the data warehouse, you can run a set of sprocs to clean up/transform the data. I don't understand yet why this is any better than just a few SQL queries. I'm not being negative, I really appreciate the talk.
Yep, you absolutely can do this. However, dbt adds many additional benefits: tight integration with source control for CI/CD, automated testing and documentation, and dependency management between jobs taken care of for you as part of orchestration, to name a few.
It really comes down to your specific requirements and what you need to achieve; it's never black and white.
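To make the dependency management point a bit more concrete, here's a minimal sketch (the model and source names are made up for illustration). Because the second model selects from the first through ref(), dbt works out the build order for you when you run dbt run:

-- models/stg_orders.sql (assumes a 'shop' source declared in a sources .yml)
select * from {{ source('shop', 'orders') }}

-- models/orders_summary.sql
select customer_id, count(*) as order_count
from {{ ref('stg_orders') }}
group by customer_id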
Love the approach of using a whiteboard and marker! Thanks for the simple yet effective intro :)
Glad you like it!
The value that you are providing is just ridiculous. Thanks for the straightforward explanation Adam
CD stands for Continuous Delivery or Continuous Deployment, not Development
wow! Plain and simple, just as valuable knowledge should be. Thanks!
Very good and consistent intro to the dbt topic, thanks!
your video was exactly what i was looking for. Amazing job Adam :)
why couldn't I find this channel earlier? amazing explanations! 10 mins into the video and I am a fan.
You’re giving us breaking data news in such an absorbing way. Thanks!
I love the way you laid this all out.
Thanks a lot. I understood it well. Very well explained!
Great explanation! You won a subscriber, thanks for the content!
Thanks Adam, simple yet good insights on DBT. Looking forward to more on this.
Very useful introduction. Thanks!
Thanks Adam. Good introduction to DBT
Dude, how are you? I have a question: what could I do if I have a stream on Snowflake that I want to "consume" in dbt without creating a physical table or view, instead something like an ephemeral materialization, only to purge the stream and keep it from becoming stale? I created an ephemeral model and selected from the stream source, but that obviously only creates an ephemeral materialization and doesn't actually clear the data from the stream. Thoughts?
I"m confused is DBT just a query builder?
If I use raw SQL in my web app instead of an ORM, where would DBT fit in? Would DBT be used to replace raw SQL queries when fetching my data variables, e.g. sql_data_python_object = raw_sql_query(against existing DB)? So instead of raw_sql_query I would now write a dbt_query? But the benefit is that this query works on more DBs than, say, a raw pg query.
So is DBT a query builder that works like an ORM, in the sense that you can use the query on any DB? If not, I don't see the purpose of it when I could just use a raw SQL query on the data warehouse after the pipeline has finished filling it up.
A query builder is a fraction of what it can do.
It has tight integration with version control which opens the door to DataOps and CI/CD.
You have a range of different materializations which can be switched in an instant using a configuration parameter across a database schema or even individual objects.
You can automate testing and easily leverage out-of-the-box standard tests, as well as create your own custom tests.
Job dependencies and orchestration are automatically handled for you, along with the flexibility of choosing which models you wish to build, run or test.
On top of that, you’ve got automated documentation, which can be generated using a single command.
I’m sure there’s more but that’s just off the top of my head.
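As a rough sketch of the materialization and testing points (model and column names are hypothetical): the materialization is one config line at the top of the model, and the standard out-of-the-box tests are declared in a YAML file next to it.

-- models/orders.sql
{{ config(materialized='table') }}  -- swap to 'view', 'incremental' or 'ephemeral' without touching the SQL
select order_id, customer_id, amount
from {{ ref('stg_orders') }}

# models/schema.yml
version: 2
models:
  - name: orders
    columns:
      - name: order_id
        tests:
          - unique
          - not_null

dbt test then runs those tests, and dbt docs generate builds the documentation site from the same project files.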
Thanks Adam, it's useful and informative. I have one doubt: we can also do transformations on Snowflake itself, so what is the reason to opt for DBT? Thank you.
Thanks for your question!
Off the top of my head: orchestration between models is taken care of, auto documentation generation, versioning of code with tight integration to Bitbucket or GitHub, and ease of applying different kinds of materializations.
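For that last point, a small hedged example (the project and folder names are hypothetical): in dbt_project.yml you can set the materialization for a whole folder of models in one place, and override it per model where needed.

# dbt_project.yml (excerpt)
models:
  my_project:
    staging:
      +materialized: view
    marts:
      +materialized: table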
@mastering_snowflake thanks for the clarification ❤
Cloud data platforms are not the reasons why data teams started to do ELT instead of ETL. ELT has been performed by data vault practitioners since 2001.
The point being you were still constrained by on-prem resources regardless of your approach. ELT was a way of adopting a pattern to work around the constraints of moving data out of the DB to your ETL infrastructure and back again.
The elastic scalability of the cloud opened up new possibilities when adopting ELT patterns.
@mastering_snowflake but ELT could be performed before cloud data platforms.
Great video. It would help more if you could explain why dbt should be used over Snowflake-native transformation options like procedures, etc.
Excellent explanation of how DBT works for performing transformations in the ELT process in Snowflake.
Hi @adam -
I'm trying to pass a list of queries as an input to a macro in my dbt project; the queries run on BigQuery in the background. I want to handle exceptions in the macro, i.e. how to throw an exception if a query has some sort of syntax issue or if table names are missing in the warehouse. I tried various ways but can't figure out a suitable solution. Right now, if one query fails it causes the whole process to stop, and what I want is that if a query fails to execute, it moves on to the next item in the list.
Great vid. Do you have an opinion on Paradime? I’m starting to see it appear. What’s your take on it and the value add?
Never heard of it!
Thanks for the details, very informative.
Ok, good. Interesting!
Sir, your audio is bad. Is this a problem with YouTube?
I think it was just YouTube for you, as no one else has mentioned this. Thank you!
great!
Whoa, he looks like Adam Sandler.
Thank you.