DP-203: 35 - Writing data to ADLSg2 from Azure Databricks

  • Published 1 Dec 2024

COMMENTS • 12

  • @prabhuraghupathi9131 · 7 months ago +1

    Very useful video for learning about managed and external tables, how to create them, and how to store data back to Azure Data Lake using an external table. Thanks Piotr for this great content!!

  • @siddharthchoudhary103 · 7 months ago +2

    What is the difference between save and saveAsTable?

    • @TybulOnAzure · 7 months ago +2

      Both will save the data but the latter will also register it as a table in the catalog.
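
      A minimal PySpark sketch of that difference (the storage path and table name below are hypothetical placeholders):

        # save: writes the Delta files to the path only; nothing is registered in the catalog.
        df = spark.range(10)
        df.write.format("delta").save("abfss://data@mystorage.dfs.core.windows.net/demo/numbers")

        # saveAsTable: writes the data AND registers it as a table in the metastore,
        # so it can later be queried with SQL, e.g. SELECT * FROM demo.numbers.
        df.write.format("delta").saveAsTable("demo.numbers")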

  • @abhijitbaner · 3 months ago

    Do Databricks MANAGED tables have any advantages over EXTERNAL tables, like better performance on read/write etc.?

    • @TybulOnAzure · 3 months ago +1

      I'm not aware of any performance differences between those two (or at least I didn't notice any).
      An important difference between them is that if you drop a managed table, it will also drop your data. In the case of an external table, only the metadata in the metastore is removed, but the actual data stays untouched (as it is stored externally).
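
      A hedged sketch of that drop behavior (the schema, table names, and ADLS path are hypothetical; in Unity Catalog the external path must also be covered by an external location):

        # Managed table: Databricks controls where the data is stored.
        spark.sql("CREATE TABLE demo.managed_tbl AS SELECT 1 AS id")
        spark.sql("DROP TABLE demo.managed_tbl")   # removes the metadata AND the underlying data files

        # External table: the data lives at a path you control.
        spark.sql("""
            CREATE TABLE demo.external_tbl
            LOCATION 'abfss://data@mystorage.dfs.core.windows.net/tables/external_tbl'
            AS SELECT 1 AS id
        """)
        spark.sql("DROP TABLE demo.external_tbl")  # removes only the metastore entry; the files stay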

  • @rabink.5115 · 4 months ago

    Is there any possibility to get access to the Databricks notebooks you use?

    • @TybulOnAzure · 4 months ago +1

      Yes, check my GitHub - link is in the video description.

  • @MokhtarBoussaada · 3 months ago

    When I tried to create an external table, I was forced to create a storage credential and an external location.

    • @TybulOnAzure · 3 months ago

      What was the code you wrote?

    • @MokhtarBoussaada · 3 months ago

      I configured access at the cluster level.
      When I saved files directly to the data lake using df.write.format("delta").save("path"), everything was fine.
      But when I tried to create the external table (using the same CREATE TABLE query as you), I got an "external location" error. Once I created the storage credential and external location and re-executed the cell, it worked.
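
      A sketch contrasting the two approaches described above (the path and table name are hypothetical; the external-table variant assumes a Unity Catalog workspace, where the target path must be covered by an external location, hence the error):

        df = spark.range(10)

        # 1) Direct path write - needs only the storage access configured on the cluster.
        df.write.format("delta").save("abfss://data@mystorage.dfs.core.windows.net/raw/numbers")

        # 2) External table - additionally requires an external location (backed by a
        #    storage credential) that covers the target path.
        spark.sql("""
            CREATE TABLE demo.numbers
            USING DELTA
            LOCATION 'abfss://data@mystorage.dfs.core.windows.net/raw/numbers'
        """)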

    • @hernanmartindemczuk · 29 days ago

      Same here. I had to:
      1- Create an Access Connector for Azure Databricks in the Azure portal.
      2- Using the Access Connector's managed identity, create a new storage credential in the Databricks catalog.
      3- Create an external location pointing to the ADLS path in the storage account, using the storage credential.
      Then I was able to run the command (see the SQL sketch below).
      BTW, this managed identity needs Storage Blob Data Reader or Storage Blob Data Contributor permissions on the storage account to work.
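
      A hedged SQL sketch of step 3 (step 2, the storage credential, is typically created through the Catalog Explorer UI as described above; the location name, credential name, and ADLS URL below are hypothetical placeholders):

        spark.sql("""
            CREATE EXTERNAL LOCATION IF NOT EXISTS my_adls_location
            URL 'abfss://data@mystorage.dfs.core.windows.net/'
            WITH (STORAGE CREDENTIAL my_adls_credential)
        """)
        # Once the external location exists, CREATE TABLE ... LOCATION statements
        # targeting paths under that URL succeed.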