Explaining what a Lakehouse is!

  • Published Oct 31, 2022
  • You've probably heard the term lakehouse with various services like Azure Synapse Analytics. But what actually is a lakehouse? And why is it different from a data warehouse?
    📢 Become a member: guyinacu.be/membership
    *******************
    Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
    🎓 Guy in a Cube courses: guyinacu.be/courses
    *******************
    LET'S CONNECT!
    *******************
    -- / guyinacube
    -- / awsaxton
    -- / patrickdba
    -- guyinacube.com
    **Gear**
    🛠 Check out my Tools page - guyinacube.com/tools/
    #PowerBI #GuyInACube
  • Science & Technology

COMMENTS • 37

  • @esteban-alvino
    @esteban-alvino 3 months ago

    It's a very nice talk with a lot of energy. That is contagious; it draws a smile from me. Thanks, guys.

  • @osPA78
    @osPA78 1 year ago +1

    We are looking to implement a DW, DL, and/or DLH where I work so I am looking EXTREMELY FORWARD to those videos. Thank you!!!

  • @lausmaja
    @lausmaja 7 months ago

    Great overview guys. Love it!

  • @kurrysamir
    @kurrysamir 1 year ago

    Great content as always! Awaiting the slowed down version with some hands!!

  • @definitelynorandomvideos24
    @definitelynorandomvideos24 1 year ago

    Great Video, at my company we are currently trying to identify which architecture is the best to unify the many data silos we have. Looks like I can show a possible solution ;)

  • @brianengelbrechtandersen9435
    @brianengelbrechtandersen9435 1 year ago +6

    We have been using Data Lakehouse, mostly with SQL serverless, for around a year now and it works very well for our customers. We have to think differently when designing the ETL/ELT, but basically we still end up with a snowflake schema and great performance. However, it requires a tool like Power BI Premium for caching data because of the rather slow and unpredictable latency from the serverless.

    • @dagg497
      @dagg497 3 months ago

      Yup we lost the fast data retention of an SQL server and instead have to deal with parquet file conversions..

  • @shilpapathi
    @shilpapathi 1 year ago

    Great content and engaging, worth every minute.

  • @minathanh7072
    @minathanh7072 1 year ago

    Great video, thank you!!!

  • @WilliamRockseo
    @WilliamRockseo 8 months ago

    Amazing information, Thank You

  • @premcst
    @premcst 1 year ago

    Waiting for the slowed-down version, Patrick!!

  • @juliangimbel1583
    @juliangimbel1583 1 year ago +2

    Great video, but using the Azure machine learning icon for the data lake is confusing for me 😅

  • @simon2093
    @simon2093 1 year ago +2

    Yep, now building out a company wide lakehouse - early days at the moment. But this thing is nuts fantastic. Especially loving delta tables.

    • @dagg497
      @dagg497 3 months ago +1

      I hate Delta parquet.. 😂
      Convert files upon staging, build delta logic and shit.. Then everything sucks when handling upserts and deletion.

  • @7anishok390
    @7anishok390 1 year ago

    Hi, please help me. I have created an external table in the Synapse lake database. Now I would like to load the records from the external table into the dedicated SQL pool table. Please advise on the procedure.

  • @mwaltercpa
    @mwaltercpa 1 year ago

    Chris Wagner is pointing us toward Synapse Analytics, I got a book and would love to also watch GIAC videos!

  • @surfh3r0
    @surfh3r0 9 months ago

    interesting! thanks!

  • @Cero_GT
    @Cero_GT 1 year ago

    Neat. We’ve been taking this approach for a couple years now but didn’t give it a fancy name.

    • @dagg497
      @dagg497 3 months ago +1

      Yeah, Fabric is nothing new to me...
      Data Factory and Synapse existed 3 years ago..
      Semi-structured data storage alternatives as well, and Apache Spark and Cosmos DB

  • @spilledgraphics
    @spilledgraphics 1 year ago +1

    where does the concept of datamarts (not Power BI datamarts) fit into this combo of data lakehouse?

  • @SetYourBarTo10
    @SetYourBarTo10 1 year ago +2

    I wonder if the increase in remote work impacted naming conventions? …data lakes…data streams…lake house…I am starting to feel like someone has a better view while they work.

  • @matheww9944
    @matheww9944 1 year ago

    Was that you Patrick on the "Power BI Update - October 2022" ? :)

  • @MichaelGreen831
    @MichaelGreen831 3 months ago

    I would have loved to hear the problem DW has that DL and DLH solve. I only heard “large” data in contrast to DW historical data.

    • @dagg497
      @dagg497 3 months ago

      Personally I don't like DLH all that much due to the slow loading of new data into parquet and the fact you read files instead of SQL tables with history.
      Who would've known merging 10,000 parquet files would be slower than a relational database?

  • @johnnyw5627
    @johnnyw5627 1 year ago +3

    When the Data Warehouse and the Data Lake went out for a few beers... was it in Belgium?

  • @rohansrivastwa827
    @rohansrivastwa827 1 year ago

    Can your audio be clearer, with less bass and noise?

  • @Al-vl3tp
    @Al-vl3tp 1 year ago

    Yes, we are building a data lakehouse with the transformed structured data pointing to Redshift and all of the unstructured data pointing to the S3 buckets. But the data scientists can't query the data in Athena and point the data to Power BI for analytics and data visualizations.

  • @Expateer
    @Expateer 2 months ago

    Q: What happens to all those BI reports with their virtual schemas when the "Silver Layer" data structure changes?

  • @nourlchann
    @nourlchann 4 months ago

    Patrick really likes data LMAOOO

  • @brianxyz
    @brianxyz 1 year ago

    What is a Databrick? TIA

    • @benhalicki9749
      @benhalicki9749 1 year ago

      A managed Apache Spark platform from Databricks, offered on Azure as Azure Databricks. Connects to the data lake etc., but for in-memory, clustered computing. Geared towards big data analytics.

  • @kevinruckstuhl2378
    @kevinruckstuhl2378 1 year ago +8

    More like data swamp

    • @dagg497
      @dagg497 3 months ago +1

      Yeah. I liked Storage Blobs, Data Factory into a Synapse or Azure SQL Database.
      Lakehouse and parquet are very dependent on a coded Scala/Python framework, which makes every step 10x more confusing and code dependent.
      We are back to SSIS 2005 and have to code an ETL framework in C+, and instead of working in BI we are moving into full-stack coding..

  • @Milhouse77BS
    @Milhouse77BS 1 year ago

    But... what about...a...: "Lake Database" :)

  • @martiruda
    @martiruda 1 year ago

    Funny, this is called a data constellation; someone added lakehouse just to sell training courses, I bet.

  • @1yyymmmddd
    @1yyymmmddd 1 year ago

    Data warehouses are dead, aren't they?