The big shift going forward is data sharing. People massively underestimate its potential. If a large organisation such as a bank adopts data sharing in the cloud (for example with Snowflake), it can publish data once, and everybody with access can use that same data without resorting to ETL. At present they build REST APIs to expose the data, and every consumer then runs its own ETL process to pull it into its own database, producing many duplicated copies. The classic scenario is exchange rates in banks. Data APIs are also often very slow at large data volumes, whereas having the data already in a highly performant storage and compute environment would be much faster.
You described it even better and more clearly, in just a few sentences 👍
They are getting bigger and bigger. I work for a big German company and they are switching from SAP HANA to Snowflake.
Oh man, when a German company switches from anything SAP, that's a good sign.
@SeattleDataGuy Absolutely!
Me too, at a massive pharmaceutical corporation. They are moving all the data from various sources (many of them SAP systems) into Snowflake in order to mesh the data together :)
Wow that’s huge..
hahaha I work for snowflake and can confirm this :D
In the last month I applied to about 20 positions in Data Engineering.
None required Snowflake experience.
From the 5 interviews I got so far none was using Snowflake.
I've been very curious to learn it and get hands-on experience with it. But it is weird that I am not hearing that much about it in the wild.
Interesting. What did they require?
@SeattleDataGuy Airflow and Spark. One was 100% GCP, so BigQuery was considered a bonus.
But they seemed more interested in how good my Python and SQL were.
@tamelo This is true. Hands-on data engineering doesn't use Snowflake. They use core features for flexibility.
You opened my mind to Snowflake. It was helpful to hear an opinion that wasn't from Snowflake's tutorial videos. THANK YOU!
I've actually wondered why it gets so much hype as well... I've even heard non-tech-savvy people talk about Snowflake, when I don't feel confident that they even know what it is used for.
Hopefully this was helpful. What have you found gets used on your projects?
@SeattleDataGuy At my previous company, some sort of stack on AWS; currently... nothing at all 😞
@LukeBarousse Well, here's hoping you recommend Snowflake next... I need that stock pumped! 😅
@SeattleDataGuy ❄🚀🌕
That's me you're referring to. That's what brought me here. Didn't understand half the video though.
Amazing video. Snowflake is just a fantastic data solution for addressing the shortcomings of traditional cloud scalability and distributed systems. Looking forward to getting my hands on it soon.
Yeah! And now there are a lot more options too
I have become a big fan of taking a hybrid approach with clients: Databricks and a medallion architecture on the lake with Delta tables for bronze (raw) and silver (a persisted, source-aligned version of your data with basic cleansing rules applied). For the gold layer I take a use-case-based approach. For classic BI and managed reporting, Snowflake, Redshift, or Synapse (depending on the vendor of choice) operates as the DWH. Alternatively, if an advanced modelling technique is in place (ML, a statistical model, etc.), it is powered by Databricks, and the results are generally stored in the lake's gold layer (and can end up in your DWH as well if required).
Great video, thanks for sharing your insights as always.
Thanks! Yeah I imagine the future will be very hybrid
Do you see Databricks as being more useful than Snowflake? Do you think Databricks (or Snowflake) could add functionality so that you can essentially do both while only needing one of them? Or do you like them both for their own separate benefits?
Great video! Didn't know about the Streamlit acquisition. dbt looks like Snowflake's next target.
I could see that. dbt is currently worth about 4 billion (according to VCs), so we shall see.
Just a note about Redshift: it was nothing but a lift and shift of the ParAccel database. It was not a cloud-native database to start with.
Yup!
This is a really good video, good conversation and discussion in the comments too
Can you do a dedicated video about the differences between Snowflake, GCP, AWS and Azure?
I love these types of "how'd we get here" tech topics. Really great!
Yeah, I enjoy looking back at the past
Snowflake uses a consumption model. Use more, pay more. Use less, pay less. At 08:00 you say "Snowflake costs have gone up", but strictly speaking, Snowflake costs themselves have actually gotten cheaper (better compression, the same jobs run faster, etc.). In many cases, this cost efficiency has allowed customers to move additional use cases to Snowflake (likely eliminating costs elsewhere) that they couldn't before. When customers are happy with a platform, and they think they are getting business value out of it, they prioritize moving more workloads to that platform.
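To put illustrative numbers on the consumption model described above, here is a rough sketch. The rates (2 credits per hour, $3 per credit, a 30-day month) and the function name are assumptions made up for this example, not Snowflake's price list. It shows how the cost of one job can fall while the total bill still rises because more workloads move onto the platform.

```python
def monthly_compute_cost(credits_per_hour, hours_per_day, price_per_credit, days=30):
    """Consumption pricing: cost scales linearly with the compute time actually used."""
    return credits_per_hour * hours_per_day * days * price_per_credit


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
CREDITS_PER_HOUR = 2.0
PRICE_PER_CREDIT = 3.0

# One job that needs 4 hours of compute per day.
before = monthly_compute_cost(CREDITS_PER_HOUR, 4.0, PRICE_PER_CREDIT)
# The same job after it gets 25% faster (3 hours per day).
faster = monthly_compute_cost(CREDITS_PER_HOUR, 3.0, PRICE_PER_CREDIT)
# The faster job plus 2 hours per day of newly migrated workloads.
expanded = monthly_compute_cost(CREDITS_PER_HOUR, 5.0, PRICE_PER_CREDIT)

print(before, faster, expanded)  # 720.0 540.0 900.0
```

Under these made-up numbers the original job gets cheaper (720 to 540), yet the monthly bill ends up higher (900) once extra use cases are added, which is the distinction the comment is drawing.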
show starts at 5:30
So glad I stumbled upon your channel, thanks Seattle Data Guy!
Glad you enjoyed the video!
Snowflake’s introduction of the Snowpark Python API might just change the game 🤔
It might. It does feel like they are trying to make sure they have an answer for people who prefer Databricks.
Snowflake provides some good documentation for Python connections through ODBC and SQLAlchemy. I've had no issues getting data from Snowflake via Python. From there you're good to go and do ML.
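For anyone who wants to try the Python route mentioned above, here is a minimal sketch that builds a connection URL in the `snowflake://user:password@account/database/schema?warehouse=...` form described in the documentation for Snowflake's SQLAlchemy dialect. The helper name `snowflake_url` and every account, database, and warehouse value are placeholders invented for this example. Actually connecting assumes the `sqlalchemy` and `snowflake-sqlalchemy` packages are installed, so that part is left as comments.

```python
from urllib.parse import quote_plus, urlencode


def snowflake_url(user, password, account, database, schema, warehouse):
    """Build a SQLAlchemy-style connection URL for the Snowflake dialect.

    User and password are percent-encoded so characters such as '@' or '/'
    do not break the URL.
    """
    query = urlencode({"warehouse": warehouse})
    return (
        f"snowflake://{quote_plus(user)}:{quote_plus(password)}"
        f"@{account}/{database}/{schema}?{query}"
    )


# Placeholder credentials; replace with your own account details.
url = snowflake_url(
    user="analyst",
    password="p@ss/word",
    account="myorg-myaccount",
    database="SALES",
    schema="PUBLIC",
    warehouse="COMPUTE_WH",
)
print(url)

# With sqlalchemy and snowflake-sqlalchemy installed (an assumption here),
# the URL can be handed to SQLAlchemy and the result loaded for ML work:
#
#   from sqlalchemy import create_engine, text
#   engine = create_engine(url)
#   with engine.connect() as conn:
#       rows = conn.execute(text("SELECT CURRENT_VERSION()")).fetchall()
```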
I think Snowflake is a great solution, but it is still somewhat expensive, especially for small or medium-sized companies that can't afford a huge budget for their IT processes. Databricks is the alternative, as far as I can see, but it is also an expensive tool.
Hey Ben, any thoughts on Synapse? It rarely gets mentioned.
Microsoft doesn't need to market as much. It has distribution by just being Microsoft.
Synapse is a useless product, it doesn't work.
Long-time follower on Medium but somehow didn't know that you had a YouTube channel. Thanks, subbed!
Thank you! I really appreciate the continued support!
Great video! Are you planning to make a video about Microsoft Fabric?
I should start digging into it!
You have so much knowledge. Please keep to one tone of voice so listeners are not distracted
Thank you! I am doing my best
Have you ever talked about data virtualization? I am wondering if there is potential in specializing in this niche using Denodo
Not at this point. I did write an article on it about 2 years ago
As a neophyte I don’t understand how Snowflake competes with Palantir and Databricks.
Oddly enough, when I did a poll on LinkedIn, 30% of 1800 people said they use Databricks as the ingestion layer and Snowflake as the storage layer. To a large degree these tools perform different tasks. Yes, Databricks wants to be your data lakehouse, but it was built as a tool for data scientists and now wants to do data management. Snowflake is doing the opposite. So looking a few years back, these solutions weren't even really competing.
What do you think of the certifications that snowflake offers and how they compare to other certifications offered by the big 3 cloud providers?
The other three cloud providers offer a much more generalized cloud experience. If you do Snowflake's cert, then you will only learn about DW/data lake stuff, and from Snowflake's perspective. Whereas there is so much other stuff to learn. Pick a cloud provider first, then Snowflake.
@SeattleDataGuy Awesome, thanks for that feedback and advice!
Where can I learn Snowflake? Could you give some advice?
I think you need to consider using the legendary Shure SM7B microphone
Love the video! Could you do one of these on Confluent too?
It's not currently on the list but I will have to consider it!
At 10:55 he mentions an abbreviation... he says there is competition in the "SMB space". Can anybody tell me what 'SMB' stands for, please? (And yes, I googled it, but was unsuccessful.)
Sorry, small and medium business
@MM - usually that's in contrast to enterprise/government--much larger organizations that spend a great deal more and typically have more specialized requirements
Seattle Data Guy, your changing voice tone makes low pitch parts barely audible.
In the last 2 months I did 10-15 interviews. Only 1 mentioned Snowflake (as in, "there is also Snowflake"). Most wanted you to know Databricks, data lakes and the ETL tools that go along with them.
If only it had a decent ".dacpac" system and you didn't need to write your own update scripts, for which you need yet another tool to deploy...
Hey Ben
Awesome Video.
Snowflake really has great marketing.
What are your thoughts on Teradata?
Glad you liked it! Oh, there is someone on my Discord who is migrating from Teradata to Snowflake. You should check out the conversation and ask them! discord.gg/2yRJq7Eg3k
Since when is Teradata cloud native??
How does Snowflake compete with Oracle?
That's a great question. An interesting point: Snowflake was founded by Benoit Dageville, Thierry Cruanes and Marcin Żukowski. Dageville and Cruanes previously worked as data architects at Oracle Corporation. From my understanding, they proposed some of the initial ideas, such as cloud data warehouses, to Oracle, and Oracle said no (because that's not how Oracle's business model works). Oracle also has a massive standard transactional DB business that still pretty much runs a lot of the systems we take for granted. But Oracle in general got beaten to the cloud by pretty much everyone else. They have tried to catch up, but they haven't been as successful as Microsoft with Azure.
Great video nicely summarized
Glad you enjoyed it!
@SeattleDataGuy How about a similar one on Databricks?
Hi, what are your thoughts on Snowflake vs Spark?
I honestly work with companies that are somehow using both Databricks and Snowflake. So I rarely think about this as a "vs". I think most companies will use some combo depending on teams.
@SeattleDataGuy Thanks for the info
I've done multiple Snowflake implementations and I still don't get the hype
Which tool do you prefer?
@SeattleDataGuy Anything not Teradata (>__
@asianfrenzy666 Haha, yes, I often tell people that 4 out of 5 of my clients prefer Snowflake. The other one is on BigQuery. I think BigQuery is solid, and I do have a goal of implementing Databricks, just haven't had the chance.
@SeattleDataGuy BigQuery is getting decent, Databricks isn't. Only BQ and SF are real services that are always on, elastic compute, unlimited storage... no need to manage sw, servers, VMs, storage etc. That said, if you are coming from a traditional SQL database (Oracle, Teradata...), SF beats everything hands down.
Still waiting to see if the Firebolt hype pans out.
How is Hadoop not a database?
damn the blue yeti is as bad as people say it is
Do you have any take on Dremio?
Dremio is solid. I think I have personally leaned towards Trino but I am biased there.
Who doesn't like snow.
Brune flake sounds better.
I'd argue we appreciate a little snow.
But, what's a brune?
It's nothing.
So let's call a real thing out for what it is.
A Brune flake.
Codestrap brought me here! You need to do a collab
That would be awesome!
Come join the discord, codestrap is very active
Why do you say that snowflake is a data warehouse? Snowflake is an engine and you can build a data warehouse in snowflake. But snowflake itself is not a data warehouse. This might be very confusing for your audience.
Awesome introduction, 👌
Glad you enjoyed it!
Awesome video! Sharing!
Thank you! I look forward to more of your palantir videos to see how people are actually taking data and making it more than just a KPI or a basic model.
From a broke student's perspective, am I correct that Databricks is free to download and use, so I can train myself on it at no cost, whereas Snowflake only gives a one-month trial and so is impossible to self-learn on the cheap?
There is a free tier of Databricks. Check out this link docs.databricks.com/getting-started/try-databricks.html
Actually, Snowflake gives you as many free trial accounts as you want (to my knowledge), even with the same email address. While they are restricted to 30 days and the credit limit, nothing stops you from signing up again and again. I just tested it before commenting: I currently have multiple trial accounts under the same email address. They don't actively advertise it, but they do state so in their essentials workshops, I believe.
1:00 TO snowflake (love the dramatic change in volume here lol)
Hopefully didn't burst any ear drums. I am testing out some new mics
Excellent overview. Thank you.
Glad it was helpful!
Hi. Great video. Thanks for the insight. I wanted to ask: do you know if Snowflake or any of the services you mentioned offers assistance with account setup? I am in over my head and need some help with this badly. I have multiple nodes running, numerous smart contracts, NFT marketplaces, user data streams, and data warehouses. Any tips or help you could give would be really appreciated. Thanks
Based on their financial statements, they are known for running a charity.
I still don't get the advantages over Azure Synapse and, more generally, the whole Azure data platform
Like Redshift, Synapse originated from on-prem code (DataAllegro -> Microsoft PDW) and suffers limitations from that origin. You still need indexes, concurrency is an issue, all compute can't see all data, no live data sharing, no geospatial, the list goes on. Otherwise, Snowflake plays nicely with the Azure ecosystem, you can buy it from the Azure Marketplace, and companies even get credit against their Azure commit when using Snowflake.
Those snowflake conferences at four seasons in Seattle always had the best food 😂
Lol, better check out the next one
In short, make your product simple and easy to use
I have actually been thinking about this recently... why have certain tools, like Python and Snowflake, gained popularity so quickly, even if there are arguably better options in other dimensions? And do they have longevity?
@SeattleDataGuy Simplicity to learn. It's like VB back in the 70s and 80s
snowflake:DE::automl/g3pt:DS automation?
I don't know if I would go that far. It would likely be closer to premade model libraries where you just import a model vs writing it out by hand.
bruh the fridge 💀
How's the job market for Snowflake?
For now it's great. But trends come and go quickly. Focus on the basics, then add tool-specific skills.
@SeattleDataGuy Thank you. Can you elaborate on the basic skills, please?
@abdullahsiddique7787 I would say SQL to the core, PySpark and Python are all you need.
@lahvoopatel2661 Thanks, Gautam. Can you suggest any good tutorial for PySpark?
That Graham Stephen and MeetKevin analogy was so fucking cringe
Thanks
No problem!
Great video! I am not using Snowflake right now but I want to give it a go.
Small feedback: it'd be amazing if channels like yours started using more inclusive language by avoiding words like "bros" or "guys". Not everyone in the industry identifies with those :-)
"guys" is also used as a general term for people in general.
@thecanadakid7622 Yes it is, but don't you think we can do better than just sticking with things that could be improved?
Snowflake sucks big time. In Sep 2021 we tried to run a test on it, but even simple examples from their website did not work. Support was like: yeah, we will fix it... one day... maybe ;) In contrast, Google BigQuery worked from day one without any issues.
Sounds unlikely. Feel free to link in the comments exactly which (of the thousands of examples in Snowflake's documentation) don't work out of the box. I've been around data warehousing for more than 20 years and Snowflake's documentation is the best I've ever seen, and unlike some vendors, you don't even have to register to see it.
Snowflake 1000+ per share!
Plot twist, this was a marketing pitch for snowflake.
It really wasn't, hahaha, or at least that wasn't the goal. There are a lot of great options these days. I usually lean on BigQuery, Snowflake and Databricks... oh, and from time to time Postgres.
I just struggle with your accent.
Palantir is way far ahead
Ahead in what? For the vast majority of use cases, Palantir doesn't compete with Snowflake. In fact, Palantir would be smart to use Snowflake as the base of their offering. For the most part, Palantir is build vs buy, and as Snowflake continues to increase sales in government, and Palantir continues to lose on the commercial side, Palantir is going to find their addressable market getting smaller and smaller. IMHO.
@StephenPace1 lmfao
@RogerGoldtoei That's all you can say because you know that when the next quarterly revenue numbers come out, Snowflake will eclipse Palantir. $SNOW Apr 2022: $422.37M revenue, up 84%, with almost $4B in the bank. $PLTR Mar 2022: $446.36M revenue, up 31%, with just over $2B in the bank. Snowflake is on a much healthier trajectory financially, and the technology runs itself rather than needing a team of builders trying to make Palantir work.
@StephenPace1 yup. I don't think in quarters
@RogerGoldtoei It doesn't matter how long you "think" about Palantir vs Snowflake; from the next quarter on, Snowflake is going to be impossible to catch from Palantir's standpoint. Palantir has 3000 employees, Snowflake has 4000. Snowflake probably has 3X the size of engineering. Snowflake's growth continues at scale. Bookmark this: Palantir's survival in a few years will hinge on Palantir replatforming their storage layer to Snowflake.
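The revenue figures quoted in this thread can be sanity-checked with a little arithmetic. The sketch below uses only the numbers as quoted in the comments ($422.37M up 84% and $446.36M up 31%), which have not been verified here, and backs out the year-ago quarter that each growth rate implies. The function name is made up for illustration.

```python
def year_ago_revenue(current_musd, yoy_growth_pct):
    """Back out the year-ago figure implied by a revenue number and its YoY growth."""
    return current_musd / (1 + yoy_growth_pct / 100)


# Figures exactly as quoted in the thread (millions of USD, YoY growth in %).
quoted = {"SNOW": (422.37, 84), "PLTR": (446.36, 31)}

for ticker, (revenue, growth) in quoted.items():
    prior = year_ago_revenue(revenue, growth)
    added = revenue - prior
    print(f"{ticker}: implied year-ago quarter ~${prior:.2f}M, added ~${added:.2f}M")
```

On the quoted numbers the implied year-ago quarters are roughly $230M for Snowflake and $341M for Palantir, so the gap between the two narrowed considerably over the year. Whether that trend continued is not something this calculation can show.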