Explaining what a Lakehouse is!

  • Published 10 Feb 2025
  • You've probably heard the term lakehouse with various services like Azure Synapse Analytics. But what actually is a lakehouse? And why is it different from a data warehouse?
    📢 Become a member: guyinacu.be/me...
    *******************
    Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
    🎓 Guy in a Cube courses: guyinacu.be/co...
    *******************
    LET'S CONNECT!
    *******************
    -- guyinacube.com
    **Gear**
    🛠 Check out my Tools page - guyinacube.com...
    #PowerBI #GuyInACube

COMMENTS • 39

  • @esteban-alvino
    @esteban-alvino 10 months ago +1

    It's a very nice talk with a lot of energy. That is contagious; it puts a smile on my face. Thanks, guys.

  • @brianengelbrechtandersen9435
    @brianengelbrechtandersen9435 2 years ago +6

    We have been using a data lakehouse, mostly with SQL serverless, for around a year now and it works very well for our customers. We have to think differently when designing the ETL/ELT, but basically we still end up with a snowflake schema and great performance. However, it requires a tool like Power BI Premium for caching data because of the rather slow and unpredictable latency of serverless.

    • @dagg497
      @dagg497 10 months ago

      Yup, we lost the fast data retention of a SQL Server and instead have to deal with Parquet file conversions.

  • @osPA78
    @osPA78 2 years ago +1

    We are looking to implement a DW, DL, and/or DLH where I work, so I am looking forward EXTREMELY to those videos. Thank you!!!

  • @simon2093
    @simon2093 2 years ago +2

    Yep, now building out a company-wide lakehouse - early days at the moment. But this thing is nuts fantastic. Especially loving Delta tables.

    • @dagg497
      @dagg497 10 months ago +1

      I hate Delta Parquet.. 😂
      Convert files upon staging, build Delta logic and shit.. Then everything sucks when handling upserts and deletions.

  • @lausmaja
    @lausmaja 1 year ago

    Great overview guys. Love it!

  • @shilpapathi
    @shilpapathi 2 years ago +1

    Great content and engaging, worth every minute.

  • @beverlywesterkamm-wallrauc5808
    @beverlywesterkamm-wallrauc5808 6 months ago

    This is exactly what we use. It is great for ad hoc analysis as well.

  • @juliangimbel1583
    @juliangimbel1583 2 years ago +2

    Great video, but using the Azure machine learning icon for the data lake is confusing for me 😅

  • @definitelynorandomvideos24
    @definitelynorandomvideos24 2 years ago

    Great video. At my company we are currently trying to identify which architecture is best for unifying the many data silos we have. Looks like I can show a possible solution ;)

  • @WilliamRockseo
    @WilliamRockseo 1 year ago

    Amazing information, Thank You

  • @mwaltercpa
    @mwaltercpa 2 years ago

    Chris Wagner is pointing us toward Synapse Analytics. I got a book and would also love to watch GIAC videos!

  • @kurrysamir
    @kurrysamir 2 years ago

    Great content as always! Awaiting the slowed down version with some hands!!

  • @Cero_GT
    @Cero_GT 2 years ago

    Neat. We’ve been taking this approach for a couple years now but didn’t give it a fancy name.

    • @dagg497
      @dagg497 10 months ago +1

      Yeah, Fabric is nothing new to me...
      Data Factory and Synapse existed 3 years ago..
      Semi-structured data storage alternatives as well, and Apache Spark and Cosmos DB

  • @premcst
    @premcst 2 years ago

    Waiting for the slowed-down version, Patrick!!

  • @minathanh7072
    @minathanh7072 2 years ago

    Great video, thank you!!!

  • @7anishok390
    @7anishok390 2 years ago

    Hi, please help me. I have created an external table in the Synapse lake database. Now I would like to load the records from the external table into the dedicated SQL pool table. Please advise on the procedure.

  • @kvelez
    @kvelez 3 days ago

    Thanks.

  • @SetYourBarTo10
    @SetYourBarTo10 2 years ago +2

    I wonder if the increase in remote work impacted naming conventions? …data lakes…data streams…lake house…I am starting to feel like someone has a better view while they work.

  • @spilledgraphics
    @spilledgraphics 2 years ago +1

    Where does the concept of datamarts (not Power BI datamarts) fit into this combo of data lakehouse?

  • @matheww9944
    @matheww9944 2 years ago

    Was that you Patrick on the "Power BI Update - October 2022" ? :)

  • @johnnyw5627
    @johnnyw5627 2 years ago +3

    When the Data Warehouse and the Data Lake went out for a few beers... was it in Belgium?

  • @surfh3r0
    @surfh3r0 1 year ago

    interesting! thanks!

  • @Al-vl3tp
    @Al-vl3tp 2 years ago

    Yes, we are building a data lakehouse with the transformed structured data pointing to Redshift and all of the unstructured data pointing to the S3 buckets. But the data scientists can't query the data in Athena and point the data to Power BI for analytics and data visualizations.

  • @MichaelGreen831
    @MichaelGreen831 11 months ago

    I would have loved to hear the problem DW has that DL and DLH solve. I only heard “large” data in contrast to DW historical data.

    • @dagg497
      @dagg497 10 months ago

      Personally I don't like DLH all that much due to the slow loading of new data into Parquet and the fact that you read files instead of SQL tables with history.
      Who would've known merging 10,000 Parquet files would be slower than a relational database?

  • @Expateer
    @Expateer 10 months ago

    Q: What happens to all those BI reports with their virtual schemas when the "Silver Layer" data structure changes?

  • @rohansrivastwa827
    @rohansrivastwa827 2 years ago

    Can your audio be clearer, with less bass and noise?

  • @kevinruckstuhl2378
    @kevinruckstuhl2378 2 years ago +8

    More like data swamp

    • @dagg497
      @dagg497 10 months ago +1

      Yeah. I liked Storage Blobs and Data Factory into a Synapse or Azure SQL Database.
      Lakehouse and Parquet are very dependent on a coded Scala/Python framework, which makes every step 10x more confusing and code-dependent.
      We are back to SSIS 2005 and have to code an ETL framework in C+, or move into full-stack coding instead of working in BI..

  • @brianxyz
    @brianxyz 2 years ago

    What is a Databrick? TIA

    • @benhalicki9749
      @benhalicki9749 2 years ago

      A managed Apache Spark platform from Databricks, offered on Azure as Azure Databricks. Connects to the data lake etc. but for in-memory, clustered computing. Geared towards big data analytics.

  • @nourlchann
    @nourlchann 1 year ago

    Patrick really likes data LMAOOO

  • @Milhouse77BS
    @Milhouse77BS 2 years ago

    But... what about...a...: "Lake Database" :)

  • @martiruda
    @martiruda 1 year ago

    Funny, this is called a data constellation; someone added lakehouse just to sell training courses, I bet.

  • @1yyymmmddd
    @1yyymmmddd 2 years ago

    Data warehouses are dead, aren't they?