Getting data into your Microsoft Fabric Lakehouse using Load to Tables

  • Published 18 Sep 2023
  • The future of data analytics is here, and it's called Lakehouse! Microsoft Fabric Lakehouse is revolutionizing the way we manage and analyze data.
    📌 In this episode, we'll explore:
    📊 Load Single File into a New or Existing Table
    📊 Learn which file types are supported in the Load to Tables feature.
    📊 Table and Column Name Validation and Rules
    📊 Walk through the process of selecting a file in the Lakehouse Files section and loading it into a new Delta table (a notebook equivalent is sketched after this description).
    🎙 Meet the Speakers:
    👤 Guest from Microsoft Fabric Product Group: Daniel Coelho, Principal Product Manager
    LinkedIn: / danibunny
    👤 Host: Estera Kot, Senior Product Manager at Microsoft
    LinkedIn: / esterakot
    Twitter: / estera_kot
    #microsoft #microsoftfabric
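
For readers who want a notebook equivalent of the walkthrough described above, here is a minimal PySpark sketch that reads a CSV from the Lakehouse Files section and writes it to a Delta table. The file path and table name are hypothetical placeholders, not taken from the video, and the Load to Tables UI may differ in the details of how it performs this conversion.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read a CSV that sits in the Lakehouse Files section (hypothetical path).
df = (spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("Files/sales/sales_2023.csv"))

# Save it as a managed Delta table in the Tables section (hypothetical name).
df.write.format("delta").mode("overwrite").saveAsTable("sales_2023")
```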

COMMENTS • 10

  • @knuckleheadmcspazatron4939
    @knuckleheadmcspazatron4939 2 months ago

    This is really awesome! For some files this is a great method. A "use it when it works" kind of thing.

  • @lukaszk4388
    @lukaszk4388 9 months ago +1

    Hi, thanks for the video. The use cases and possibilities were explained very thoroughly; loading from a folder especially comes in handy.
    There is one thing I don't understand: if the preview doesn't display correctly, why should we drop the table and load it again? How can you be sure the problem won't repeat? I would rather see the reasons why the table didn't load correctly so I can understand where the problem is. What do you think?

  • @TomFrost33
    @TomFrost33 2 months ago

    There are many video options about loading data into a Lakehouse. How do we manage/edit the data once it is in there?

  • @XEQUTE
    @XEQUTE 3 months ago

    That automation thing can be very handy.

  • @sanishthomas2858
    @sanishthomas2858 2 months ago

    Nice. If I save the files from the source into the Lakehouse Files section as CSV and JSON, will they be saved as Delta parquet? If not, why do we say data is saved in OneLake as Delta parquet?
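
The distinction this question touches on: files dropped into the Files section stay in their original format, and only tables in the Tables section are stored as Delta parquet in OneLake. A minimal sketch, assuming hypothetical paths and table names, of converting a raw JSON file in Files into a Delta table:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical path: a JSON file landed in Files/ keeps its original JSON format.
raw = spark.read.option("multiline", "true").json("Files/landing/orders.json")

# Only once it is written to the Tables section does it become a Delta parquet table.
raw.write.format("delta").mode("append").saveAsTable("orders")
```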

  • @billkuhn5155
    @billkuhn5155 9 months ago +1

    Very helpful video.
    Does/will Load to Tables support incremental load from Lakehouse files using merge?
    i.e., if files containing inserts, updates, and deletes are copied into Lakehouse Files, each file needs to be merged (in chronological order) into the Lakehouse table so that the correct final state is attained.
    Also, is there a way to retain history in the Lakehouse table with the ability to time travel (a popular feature of other table formats like Iceberg)?
    Thanks in advance for any pointers/suggestions.

    • @GuillaumeBerthier
      @GuillaumeBerthier 8 months ago

      Agreed, I would like to see incremental load capability with upsert/delete/merge operations; if I understood correctly, it currently does append only 😮
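
As this thread notes, Load to Tables appends (or overwrites) rather than merging. A notebook-based alternative can do incremental upserts with the Delta Lake merge API, and the Delta transaction log already supports time travel. This is only a sketch: the table name, key column, and the "op" delete-flag column are assumptions for illustration, not part of the Load to Tables feature.

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Hypothetical staged batch in Files/ with an 'op' column flagging deletes.
updates = (spark.read
           .option("header", "true")
           .csv("Files/staging/customers_batch.csv"))

# Merge the batch into an existing Lakehouse table keyed on 'customer_id'.
target = DeltaTable.forName(spark, "customers")
(target.alias("t")
    .merge(updates.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedDelete(condition="s.op = 'D'")   # rows flagged as deletes
    .whenMatchedUpdateAll()                      # remaining matches become updates
    .whenNotMatchedInsertAll()                   # new keys become inserts
    .execute())

# Delta keeps a transaction log, so table history and time travel are available:
spark.sql("DESCRIBE HISTORY customers").show()
previous = spark.sql("SELECT * FROM customers VERSION AS OF 0")
```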

  • @ricardoabella867
    @ricardoabella867 8 months ago

    You import the CSV from a folder on the computer, but where is the connection to the file that originated the CSV? I see that the CSV in Fabric is static and is not being updated.

  • @DanielWillen
    @DanielWillen 10 months ago

    We have an older AX09 database that is read-only. It has about 1000 tables. There's absolutely no easy way to copy those tables into a Lakehouse, even with pipelines. For one, the copy tool doesn't support schemas, so dbo.inventtrans becomes dbo.dbo_inventtrans in the target. Furthermore, you basically have to export one table at a time, because when selecting multiple, schema mappings are not generated. Then you add to that the strict case-sensitive queries. From Azure SQL to Azure Serverless to Fabric Warehouse in a span of just 4 years: it's too much to ask companies that have lots of data and integrations going on to make the switch every time.

    • @SciCoBunny
      @SciCoBunny 9 months ago

      Hi Daniel! It's Daniel here! 🙂
      Your scenario may align better with SQL mounting technologies or with ingesting directly into the SQL DW layer. If the latter, your data will show up as Delta directly. In both cases, there is more tooling around it. There is a lot coming on migration, and I'm sure your scenario is covered.
      I'd also say that copying data to the Lakehouse as Delta would be a "write once, run many" script coded in PySpark (Notebooks or SJDs) and operationalized using either Pipelines or Fabric's direct scheduling capabilities.
      Nevertheless, I'm forwarding your feedback to the Data Integration teams.
      Thanks!
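
A minimal sketch of the kind of "write once, run many" PySpark copy script mentioned in the reply above: loop over a list of source tables, read each over JDBC, and save it as a Delta table, choosing your own convention for the schema prefix. The connection string, credentials, and table list are placeholders and assumptions for illustration, not a prescribed Fabric migration approach.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Placeholder connection details for the read-only source database.
jdbc_url = "jdbc:sqlserver://<server>;databaseName=<database>"
props = {
    "user": "<user>",
    "password": "<password>",
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
}

# In practice this list could be built from INFORMATION_SCHEMA.TABLES instead of hard-coding.
source_tables = ["dbo.inventtrans", "dbo.custtable"]

for source_table in source_tables:
    df = spark.read.jdbc(url=jdbc_url, table=source_table, properties=props)
    # Pick your own target naming convention; here the schema is folded into the table name.
    target_name = source_table.replace(".", "_")
    df.write.format("delta").mode("overwrite").saveAsTable(target_name)
```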