SQL Server in Microsoft Fabric? Use CETaS to move data!

  • Published Dec 15, 2024

COMMENTS • 50

  • @chainpeu
    @chainpeu 1 year ago +16

    Using on-prem SQL Server as a Power BI data source must be one of the most widespread use cases out there. Before Fabric, exposing SQL Server tables to Power BI was very straightforward using Power BI gateways.
    It seems that now with Fabric we need to create SQL Server partitions, then provision blob storage in ADLS, then run a CETAS to write that partition as a parquet file on ADLS, then expose this to OneLake using shortcuts, then reformat the virtualized parquet files in OneLake as a Delta table, then create the data model in Power BI, and only then can you build a Power BI report from a SQL Server table. Am I the only one who sees this as a convoluted, crazy, overengineered process?
    There are probably other ways using Azure Pipelines to copy data from on-prem SQL Server to OneLake (now in preview, I believe). But even when that is GA, the concept of having to copy your on-prem SQL Server structured data to a blob storage persistence layer to do simple reporting looks twisted. Companies that want to embrace Fabric are going to have to re-engineer their ETL processes to write to OneLake and deal with a heck of a one-time data load for the existing historical data. Just to reach, probably, a worse performance scenario than the one they already have.
    Direct Lake connectivity might be the one saving grace in this whole thing... we will have to see. But as of now I'm very sceptical and a bit pissed at MS with their strategy to push everyone (eventually) to this OneLake architecture that in many use cases is absolutely not required.

    • @noahhadro8213
      @noahhadro8213 1 year ago +5

      Well said

    • @GuyInACube
      @GuyInACube  1 year ago +5

      Appreciate the long comment and the thought you put into it. To be clear, the approaches you've used in the past with Power BI, leveraging data from on-prem, still work! That hasn't changed. There are new approaches, and things are evolving. If you still want to import your data or use DirectQuery against something like SQL Server via a gateway, go for it! If you want to leverage OneLake for reuse of data, you can do that now as well. They are different architectures, though. It's a question of whether you would benefit more from a lakehouse approach or something different. Fabric gives you a lot of options and different tools in your toolbelt.
      A lot of companies already have a data lake type approach where data is already in storage of some kind. This is an easy fit for Fabric. As you mentioned, if you want to take your existing on-prem assets to a lake centric approach, there is some thought and architecture that needs to go into doing that. But again, if what you have is working, that gives you time to think about that approach and if it makes sense from a business perspective. What problem are you trying to solve? Does a lake centric approach with Microsoft Fabric's new capabilities help solve that problem?
      The ability to use the gateway with pipelines, in Fabric, isn't an option yet. Get your feedback in at aka.ms/fabricideas! Lots of folks commenting the desire to have it - and I feel it makes sense.

    • @jeffgray8076
      @jeffgray8076 1 year ago

      Thanks for making this comment - I thought I was taking crazy pills watching that video. Taking data from an existing table, running it through several very brittle engineering steps, only to convert to a new table for use had me wondering what I was missing. Very skeptical of this use case.

    • @dagg497
      @dagg497 8 months ago +1

      @@jeffgray8076 I agree with you. I would have liked to see Fabric as just repackaged Azure Storage Gen2 + Data Factory + Synapse on top of a relational database like Azure SQL Database or Serverless.
      This move to Parquet is ONLY beneficial to Databricks imo.
      AND you get stuck with basically having to write a Scala/Python ELT framework just to handle the Delta Lake ingestion/conversion/time logic..
      Oh, and it's real fun querying the data in SSMS and having to use an exact Azure URL 🤣🤣
      Side note: I hate Parquet and Databricks already after 8 months of tackling them.
      Not to mention that in past years you never knew if Excel/Teams stored data in OneDrive/SharePoint/MS Stream.
      AND now there's also Dataverse, which actually looks very promising for the Power Platform with the new Power Pages, maybe to replace SharePoint altogether!
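[Editor's note] For readers following this thread: the CETAS archiving step being debated above can be sketched roughly as follows. This is a hypothetical sketch in SQL Server 2022 syntax; the table, data source, and file format names are placeholders, and the external data source and parquet file format are assumed to have been created beforehand (CETAS on SQL Server also requires PolyBase with export enabled).

```sql
-- Export "cold" rows (older than two years) to parquet files in ADLS.
-- dbo.SalesArchive, AdlsArchive and ParquetFF are placeholder names.
CREATE EXTERNAL TABLE dbo.SalesArchive
WITH (
    LOCATION    = '/archive/sales/',  -- folder inside the ADLS container
    DATA_SOURCE = AdlsArchive,        -- assumed pre-created EXTERNAL DATA SOURCE
    FILE_FORMAT = ParquetFF           -- assumed pre-created parquet FILE FORMAT
)
AS
SELECT *
FROM dbo.Sales
WHERE OrderDate < DATEADD(year, -2, GETDATE());  -- cold data only
```

The resulting parquet folder is then what gets surfaced in OneLake via a shortcut, per the workflow described in the comment above.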

  • @Storm_Troooper
    @Storm_Troooper 1 year ago +2

    Yoooo!!! Great content guys. You guys are making my fabric journey easy to navigate with your amazing content. I love the fabric community

  • @yanny2000
    @yanny2000 1 year ago +9

    Would love to see a video from you guys showing some examples or best practices for getting data from different sources and databases into Power BI.
    For example: we have a CRM running on MySQL, an ERP on Firebird, and a financial ERP on MSSQL. What is the best strategy to bring the data together in order to analyse it in Power BI? And where: in Fabric OneLake, in an Azure SQL data warehouse, ...?

    • @GuyInACube
      @GuyInACube  1 year ago

      We are already thinking about this. When we think of on-prem data, for lakehouse and warehouse, the only real option right now is Dataflows Gen2. The existing approaches with Power BI and the on-premises data gateway are still valid approaches for reporting on data. That doesn't take advantage of the new workloads though.
      I'll go back to what business problem are you trying to solve though and which tool will help you solve that problem. Going with a lake centric approach with Lakehouse/Warehouse/OneLake should be approached with some architecture in mind and the understanding that orchestration will need to take place regardless of the sources.

  • @kayespedal10
    @kayespedal10 1 year ago +1

    This was my second video I've watched. Wowsers, blown away, all that movement of data and zero code.
    A+

    • @GuyInACube
      @GuyInACube  1 year ago

      BAM! Welcome to the channel. Appreciate you watching!

  • @denyschamberland8897
    @denyschamberland8897 1 year ago +3

    Question: I can easily extract data from an HTML financial stocks table using a PySpark notebook with a BeautifulSoup crawler, define the schema and column headers, create a dataframe, and save the result in Delta (Stocks.part01, Stocks.part02, ..., Stocks.part0n) in a Lakehouse subfolder under Files. (This could also be achieved using a Python .py file as a Spark Job Definition with a schedule.) Still, Delta doesn't bring many options other than read (Preview) at this stage. So I guess we have little choice but to create a SQL table in the Warehouse using CETAS. The sample shows a single file, but what if we need to sort of "concatenate" all the Delta parts (Stocks.part01, Stocks.part02, ..., Stocks.part0n) into one table from the same Delta location?
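[Editor's note] One way to handle the "concatenate all the parts" question above is a wildcard path: OPENROWSET over a folder reads every part file as a single rowset, so no explicit concatenation is needed. A hypothetical sketch in Synapse serverless / SQL Server 2022 CETAS syntax; dbo.Stocks, LakehouseStorage, ParquetFF and the paths are placeholder names, with the external data source and file format assumed to exist:

```sql
-- Materialize all Delta part files from one folder into a single external table.
CREATE EXTERNAL TABLE dbo.Stocks
WITH (
    LOCATION    = '/curated/stocks/',
    DATA_SOURCE = LakehouseStorage,   -- assumed pre-created EXTERNAL DATA SOURCE
    FILE_FORMAT = ParquetFF           -- assumed pre-created parquet FILE FORMAT
)
AS
SELECT *
FROM OPENROWSET(
    BULK 'Files/delta/Stocks/*.parquet',   -- wildcard picks up part01 .. part0n
    DATA_SOURCE = 'LakehouseStorage',
    FORMAT = 'PARQUET'
) AS parts;
```

One caveat: reading the raw parquet parts of a Delta folder bypasses the Delta transaction log, so rows removed or rewritten by Delta operations can reappear; where the engine supports reading the Delta format directly, that is the safer option.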

  • @clinthofeldt8299
    @clinthofeldt8299 1 year ago +2

    Any word on if we will have the ability to see usage metrics when moving items over from SQL into a Fabric Lakehouse or Warehouse similar to how we currently can utilize the Monitoring features in Azure to see Azure SQL server space and DTU usage when performing read/write operations? If I point a Power BI Report/Dataset to the Lakehouse SQL Endpoint and set an hourly refresh rate on that, it would be helpful to compare that read/write usage to the comparable Azure SQL metrics to see what makes sense to use cost-wise for different scenarios.

  • @gpltaylor
    @gpltaylor 5 months ago

    Things are getting very exciting! Is there a way to stream live SQL Server production data into Fabric?
    We need to report on live data but we don't want to drop and recreate the data sources.. what's the best option for streaming live data into Fabric?

  • @AdrianMcGavock
    @AdrianMcGavock 1 year ago +1

    Great video, cool demo from Bob 🥳

    • @GuyInACube
      @GuyInACube  1 year ago

      Much appreciated! Thanks for watching 👊

  • @ravindersunkara5295
    @ravindersunkara5295 1 year ago +1

    Fabric & one data lake concept is cool

    • @GuyInACube
      @GuyInACube  1 year ago

      We agree! Can take things to a whole new level 👊

  • @akthar3
    @akthar3 11 months ago

    Am I right in understanding that you cannot create shortcuts in a MS Fabric lakehouse to directly access PaaS SQL Server tables? So you need a complex workaround to make it happen.
    What is the story behind this limitation? Is there a more straightforward solution on its way in the future?

  • @dagg497
    @dagg497 8 months ago +1

    If it ain't broke, don't fix it. This is just ETL with convoluted extra steps 😅
    I am really starting to miss Blob Storage and Gen2 storage and a foolproof SQL database..
    If the data lake filesystem were a GUI, fine. But in the cases I've seen, you need a Scala/Python framework to handle parquet ingestion/conversion. Oh, and querying the data via an Azure URL string is no fun 😭
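[Editor's note] An illustration of the "Azure URL string" complaint above: an ad-hoc query over archived parquet has to spell out the full storage path. A hypothetical sketch (Synapse serverless syntax; SQL Server 2022 additionally needs a data source or credential configured); the account, container and path are placeholders:

```sql
-- Ad-hoc read of archived parquet: the full URL appears in the query text.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://myaccount.dfs.core.windows.net/archive/sales/*.parquet',
    FORMAT = 'PARQUET'
) AS r;
```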

  • @sonicdoomofficial9200
    @sonicdoomofficial9200 4 months ago

    Got a specific question and don't know how to navigate it. I have an Azure SQL Managed Instance in the same resource group as my Fabric capacity, but I cannot for the life of me get Fabric Data Factory to connect to it. I can connect to the instance remotely via SSMS. Any idea what to troubleshoot? It looks networking-related somehow.

  • @zorgath
    @zorgath 1 year ago

    Is there any way to do this prior to SQL Server 2022?

  • @Khan-mc3gi
    @Khan-mc3gi 1 year ago

    I want to ask: just as one click in Fabric can show a Power BI dashboard report, can one click in Fabric also show SQL results, like total sales, total orders, max sales, and other SQL queries? If that were possible it would be great, because then we wouldn't need to learn SQL; we could get all the SQL queries in one click in Fabric. Please answer my question.

  • @scottwilliams6741
    @scottwilliams6741 1 year ago +1

    Amazing, info! Thanks, Bob!

  • @robbief1
    @robbief1 1 year ago

    How long until an Azure SQL table can exist as a Fabric table? (The lines are becoming blurred.)

    • @GuyInACube
      @GuyInACube  1 year ago +1

      Great question! Not sure if that's even in the works. Get your feedback in though at aka.ms/fabricideas if that's something you'd like to see.

  • @mnhworkdev1652
    @mnhworkdev1652 1 year ago

    I wonder what the SQL DB compatibility level needs to be to achieve this.

    • @_HugoQueiroz
      @_HugoQueiroz 1 year ago +1

      There's no compatibility level requirement for CETAS.

  • @Karenshow
    @Karenshow 1 year ago

    Cold data vs Hot Data ?

    • @dagg497
      @dagg497 8 months ago +1

      Hot data is what you typically use. For instance, with measures and KPIs in Power BI you usually compare to "Same Period Last Year", so hot data is usually the last 2 years.
      Cold data is historic data more than 2 years old. Let's say year 3 through year 10.

  • @kc4848_cal
    @kc4848_cal 6 months ago

    Now how do I convince my IT team, who controls all this but doesn't use any of these tools, that we need this by the end of the week?

  • @kanishk7267
    @kanishk7267 1 year ago +1

    Amazing content.

  • @OMNS777
    @OMNS777 1 year ago

    Yo ! To both of y’all!!

  • @JasonHurley
    @JasonHurley 1 year ago +3

    Important question... are people seriously calling it "The Fabric"?

  • @joshash5944
    @joshash5944 1 year ago

    Just waiting for the perfect world where Delta performs as well as, or better than, an indexed relational data warehouse.
    I'm skeptical of Microsoft's use of Delta in Fabric Warehouse, and I hope it improves; the performance has been poor. It seems like Synapse Serverless.

  • @TedSeeber
    @TedSeeber 1 year ago +5

    "I didn't write a single line of code"- as he shows approximately 60 lines of SQL Scripting.
    I think this no-code thing is not quite as ready for primetime as Microsoft wants us to believe.

    • @reams
      @reams 1 year ago

      Right....I was like how did you create the archive tables 😂

    • @bobward436
      @bobward436 1 year ago +2

      @@reams It is a fair comment. What we meant is that no code is needed once I had the cold data archived in Azure Storage.

    • @dagg497
      @dagg497 8 months ago +1

      Agreed. Setting up external table management probably isn't static and has to run in a stored procedure. On top of that you have that sweet, sweet SQL that needs to query an exact Azure storage URL 😂

  • @szklydm
    @szklydm 1 year ago +3

    #IWatchPatrick #TheFabric

    • @GuyInACube
      @GuyInACube  1 year ago

      haha Patrick appreciates that! 🙏👊

  • @Bizza12345
    @Bizza12345 1 year ago

    great video

  • @ItsNotAboutTheCell
    @ItsNotAboutTheCell 1 year ago +1

    #QuickCreateEverything!!!

  • @tomek8464
    @tomek8464 1 year ago +1

    I just did a yo😂

  • @vincenzobuono3841
    @vincenzobuono3841 1 year ago

    The same thing we already have in Synapse Analytics, but now we can use a new word... shortcut....

    • @GuyInACube
      @GuyInACube  1 year ago

      Shortcuts are amazing! This helps you to leverage the next generation of Synapse!

  • @thomasbrothaler4963
    @thomasbrothaler4963 1 year ago

    Great video, but anyone else feeling uncomfortable due to the lack of naming conventions used for the SQL tables and columns? 😀