James Serra
Joined 14 Sep 2006
Microsoft data platform
Videos
Microsoft Fabric introduction
2.4K views, 11 months ago
Microsoft Fabric is the next version of Azure Data Factory, Azure Data Explorer, Azure Synapse Analytics, and Power BI. It brings all of these capabilities together into a single unified analytics platform that goes from the data lake to the business user in a SaaS-like environment. Therefore, the vision of Fabric is to be a one-stop shop for all the analytical needs for every enterprise and on...
Microsoft Fabric: Lakehouse vs Warehouse
15K views, 1 year ago
In Microsoft Fabric, I see a lot of confusion about the differences between Lakehouse and Warehouse - when to use what. I created this video to give a brief overview of each, then dive into the differences that hopefully will clear things up for you. The deck from this presentation can be found here: publicstoragejs.blob.core.windows.net/presentations/Microsoft Fabric - Lakehouse vs Warehouse -...
Microsoft Fabric introduction
20K views, 1 year ago
Microsoft Fabric is the next version of Azure Data Factory, Azure Data Explorer, Azure Synapse Analytics, and Power BI. It brings all of these capabilities together into a single unified analytics platform that goes from the data lake to the business user in a SaaS-like environment. Therefore, the vision of Fabric is to be a one-stop shop for all the analytical needs for every enterprise and on...
Data Lakehouse, Data Mesh, and Data Fabric - James Serra - Data Toboggan - March 2023
3.1K views, 1 year ago
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a modern data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. They all may sound great in theory, but I’ll dig into the concerns you need to be aware of before taking the plunge. I’ll also include use cases so you can s...
Big Data Architectures and The Data Lake - James Serra - PASS Cloud Virtual Group - Aug 2019
841 views, 1 year ago
With so many new technologies it can get confusing on the best approach to building a big data architecture. The data lake is a great new concept, usually built in Hadoop, but what exactly is it and how does it fit in? In this presentation I’ll discuss the four most common patterns in big data production implementations, the top-down vs bottoms-up approach to analytics, and how you can use a da...
Should I move my database to the cloud? - James Serra - SolarWinds - Aug 2018
326 views, 1 year ago
So you have been running on-prem SQL Server for a while now. Maybe you have taken the step to move it from bare metal to a VM, and have seen some nice benefits. Ready to see a TON more benefits? If you said “YES!”, then this is the session for you as I will go over the many benefits gained by moving your on-prem SQL Server to an Azure VM (IaaS). Then I will really blow your mind by showing you ...
Azure Synapse Analytics: A Data Lakehouse - James Serra - Triangle SQL Server UG - Aug 2020
921 views, 1 year ago
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, pr...
Azure Synapse Analytics Overview: A Data Lakehouse - James Serra - Miami Data and AI UG - Oct 2020
393 views, 1 year ago
Azure Synapse Analytics is Azure SQL Data Warehouse evolved: a limitless analytics service, that brings together enterprise data warehousing and Big Data analytics into a single service. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources, at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, pr...
Data Warehousing Trends, Best Practices, and Future Outlook - James Serra - BI Conference - Jun 2022
2.3K views, 1 year ago
Over the last decade, the 3Vs of data - Volume, Velocity & Variety - have grown massively. The Big Data revolution has completely changed the way companies collect, analyze & store data. Advancements in cloud-based data warehousing technologies have empowered companies to fully leverage big data without heavy investments both in terms of time and resources. But, that doesn’t mean building and mana...
Data Warehousing Trends, Best Practices, and Future Outlook - James Serra - Florida UG - April 2022
475 views, 1 year ago
Over the last decade, the 3Vs of data - Volume, Velocity & Variety - have grown massively. The Big Data revolution has completely changed the way companies collect, analyze & store data. Advancements in cloud-based data warehousing technologies have empowered companies to fully leverage big data without heavy investments both in terms of time and resources. But, that doesn’t mean building and mana...
Data Lakehouse, Data Mesh, and Data Fabric - James Serra - Azure Community Conference - Oct 2021
146 views, 1 year ago
Data Lakehouse, Data Mesh, and Data Fabric - James Serra - Calgary Azure Analytics UG - June 2022
1.3K views, 1 year ago
Data Lakehouse, Data Mesh, and Data Fabric - James Serra - DataMinutes - Jan 2022
374 views, 1 year ago
The Alphabet Soup of Data Architectures - James Serra - Podcast - Aug 2021
143 views, 1 year ago
Data Lakehouse: Debunking the Hype - James Serra - Podcast - Aug 2021
162 views, 1 year ago
The Rise of Data Mesh: Panel discussion - James Serra - Decisive 2022 - June 2022
116 views, 1 year ago
Interview on data mesh and data warehousing - James Serra - UNION: The Data Fest - Nov 2022
132 views, 1 year ago
Data Mesh, Data Fabric, Data Lakehouse - SQLBits 2022
52K views, 2 years ago
By far the best introduction to Fabric I have seen. Cuts out the marketing fluff and covers real design insights.
Hi James, Thanks for this video, it complies well with my thoughts. FYI I had or have all 3 Mesh Types in different productions. Mesh Type 1 uniformity and centralization is good to start fast from scratch as a Data Mesh core. Mesh Type 2 distribution and its domain and storage total uniformity are too optimistic for real life and have limited application. Mesh Type 3 is too agile and diverse, so it may introduce a high total cost of ownership if it is selected as a core for Data Mesh, however, if there is a hybrid cloud, 3rd party SaaS API, a custom HPC cluster, a "good enough" legacy system, or security/privacy constraints IMHO it is reasonable to use Mesh Type 1 as a core and combine it with Mesh Type 3 plugins, where each Mesh Type 3 plugin mimics Mesh Type 1 approach for its input and output data products. I prefer using Mesh Type 1 + 3 as it offers centralized control, cost-efficiency, and fast development. This is achieved by utilizing the Mesh Type 1 core as a default and having the flexibility to extend the core with diverse Mesh Type 3 domain plugins, giving us agile capability.
Hi Denis...thanks for the feedback! I'll definitely expand on the 3 mesh types in my book for the next edition 🙂
Great information in this video. I have seen so many other channels and videos saying bla bla about Fabric, but this video looks the most informative. Guys, like and subscribe for this effort.
Thanks for the kind words Sanish!
ua-cam.com/video/VYmjJe2gR1A/v-deo.html Unity Catalog in Databricks has solved the RLS and Column Masking. A very impressive and interesting video. For me it served as a good recap :) Cheers, thank you.
I work as a data engineer in Snapp Group in Iran. I recently read your book and I want to thank you for your wonderful book. The contents of your book were very close to my experiences in recent years and I enjoyed the events.
Thanks for the kind words Hamed!
Great video, finally understand what this
End of the day, new warehouse FEELS almost like the serverless sql endpoint with the ability to also now write data with T-SQL in a DEFAULT DELTA format + the Polaris engine + the ability to use Snowflake like cloning to make test environments a lot easier in the same way Juan Soto should make things easier for the Yankees? 🤣
Nicely explained! Thanks
That was a great explanation. Thanks for this.
Crystal clear explanation.
Yes, that WAS a very clear presentation : I really was confused between L and Ws. He anticipated my question of why the 2 instead of just one. We need the behind the scenes details (like this) to grok what is going on.
Glad you found it helpful!
Thank you sir, it's now much clearer.
Thank you for the amazing video..
I want to say a big thank you for incredibly useful and easy explanations. Well done.
Thanks for the kind comment!
Hi, thanks for the great content! in terms of performance of querying and transforming data, which of the two is preferred? Or does it come down to comparing Spark vs T-SQL?
It does come down to Spark vs T-SQL, with Spark usually faster for transforming data.
Great! Thank u.
Huge regrets for not finding this video earlier. The best I have seen so far. Yesterday I got your book from O'Reilly. Spent the night reading it, and all I can say is that such a book on this subject was very much needed. Thank you very much, James. I have started following you on LinkedIn.
“things i like behind me” for some reason i thought to myself he likes printers 😂 … jk this is a great video Thank you!
Nice talk for understanding, at a high level, how everything connects together and compares with existing offerings.
Amazing way of explaining with granular details. Thanks
Thanks for the explanation. It's really helpful.
Great intro video!! Fully agree on the comment regarding dedicated SQL pool!! We made a tough but wise decision to stay away from it for new analytics projects kicked off at the beginning of this year!!
Great Presentation , thank you
BY FAR THE SINGLE MOST EFFECTIVE EXPLANATION EVER ❤
How can I get access to the links on the presentation?
This was incredibly helpful! I am prepping to teach the DP-500 and one of the things I am trying to figure out is how to compare/contrast/talk about "traditional" Synapse vs. Fabric. Your insights really help!
Awesome content, which not only explains the technical nuances but also explains the use case of Lakehouse versus Warehouse from the architecture design perspective. Enjoyed watching the video and got lots of clarity. Thanks James
Thank you! I was thinking that the Warehouse was more like a dedicated pool. I'm hoping it is a bit more robust than the current Synapse Serverless.. at least once we get more T-SQL support..
Finally, I understood the whole buzz around Fabric. Main differentiators from Synapse look like SaaS | OneLake | Dropping MPP aka Dedicated Pool | Compute/storage decoupling (official) |...plus a few more. I am not sure how well the "Auto discovery and registration of table" feature will work - especially if a metastore already exists in Databricks. Will Databricks share its metastore with Fabric or will we recreate the metastore here? Also, will the metastore be at workspace level or tenant level?
If you were to create a shortcut to a CSV file, right now it seems that you have to manually create the table, and it does not update if the source file changes. It's quite common that we have folders in the data lake containing versions of a CSV: Sales_01, Sales_02, etc. In serverless SQL you could just target them with OPENROWSET and an asterisk. Sales_* would just load all the files in the folder. And since it was a view, you would always guarantee the data was not stale. Are there plans to improve the way data is loaded as tables in the lakehouse to support this?
as an old-school EDW guy, you cut through much of the terminology fuzziness and answered several questions I had. Thank you
Yup: the terminology fuzziness... source of soooo much confusion, and "wrong thinking"
Thanks for a simple yet insightful session
Great video! Can you please post the links in your Data Mesh section (46:37) or the presentation as a whole? Thanks.
Glad you like it! I posted the deck link in the description.
Great overview! Thankful for the 'Azure-Translation' (i.e., showing how all the products fit into the overall data flow)!
good explanation, thank you
Does MS Fabric support things like private connections to on-prem and key-vaulted credentials/configs like Azure does? Is it something we can expect to see in the future if not there already?
That was as clear as you could get. Thanks for the great work!
very clear! Thank you James
Thanks for this, it was very helpful. Showing the sql endpoint on the lakehouse editor was a revelation.
Excellent content. I've been using Synapse for a while and it took me some time to get through all the different options and limitations. This video helps clarify a lot about Fabric. Can't wait for the book to be published. Thank you.
As always, you break down what has been a source of confusion with clarity - Thanks!
Thanks for the kind words!
Hi, great content - why not use both Warehouse and Lakehouse? For example, Databricks for the Lakehouse and AWS RDS for the Warehouse.
Hi James, first of all, great video! Most of our current dashboards are built on Tableau which connects to our SQL database. If we moved to Fabric and used OneLake, would we still be able to connect to Tableau?
Yes, Tableau can pull in data from OneLake, since the data in OneLake is stored in delta format, which Tableau can read
Hi James, thanks for the good introduction! Good insights into the architecture of Fabric/OneLake. Will it be possible to use data virtualization as a layer between on-premise SQL Server 2022 and OneLake (maybe, in conjunction with the new feature Shortcuts). I remind myself, that SQL Server 2022 is able to use the Polybase v3 feature in conjunction with ADLSv2 access (virtually), but since OneLake is somehow extra/segregated from ADLSv2, I doubt that it will be possible, at the moment. A workaround for me would be to data virtualize between on-premise and ADLSv2, and then shortcut/"bridge over logically" to OneLake (hosted in Fabric capacity). Thanks in advance!
Hi Thomas, I expect you will eventually see Synapse link integrated into Fabric. Synapse Link supports data virtualization to SQL Server 2022. Your workaround will do in the meantime 🙂
6:39
ua-cam.com/video/a6A3jtvB62U/v-deo.html "Do not [...] use Pipelines within Synapse" I hope we are fine as long as we don't use (Synapse) Data Flow activities.
Hi Martin, I would also avoid using Synapse pipelines and use ADF pipelines instead. This is because there will be a migration tool for ADF pipelines to Fabric much sooner than a migration tool for Synapse pipelines to Fabric
Once again, Great and useful video! Thank you James!