Ask Databricks - About Unity Catalog with Paul Roome

  • Published 12 Sep 2024

COMMENTS • 5

  • @Markttt5 (11 months ago)

    These Ask Databricks sessions are great, Simon. The content, subjects and questions being covered are spot on. Thanks for all your efforts.

  • @allthingsdata (11 months ago) +1

    The main design flaw I see in UC is the handling of external vs. managed tables. If I create a schema without an external storage location, tables are created as managed tables in the default UC location. If I create a schema with an external storage location and then create a table under it (without explicitly giving a location), I would assume that the table is created in the directory of the schema. But UC creates a managed table in subdirectories with random IDs, e.g. `schema_dir/__unitystorage/schemas//tables//`. This makes it impossible to know the location deterministically, and other tools that want to access the table first need to look up the "managed" location. In Hive, such tables were also created as managed, but in the schema directory.

    I can only speculate about why Databricks introduced these random structures, but conceptually it doesn't make sense to me. I would expect all tables created under a schema with a storage location to be created as external tables under the very directory that I have already provided for the schema. They constantly talk about openness, but if every tool needs to go through UC to get the location of a table, it's not really open. And yes, I know that you can always simply provide the location during every table creation, but that is not (business) user-friendly, safe or intuitive.

    I wonder what @AdvancingAnalytics thinks about this. How would you use external tables and avoid always having to provide the location? Thanks.
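
One way to approach the question above (deterministic paths without typing a location by hand each time) is to generate the `CREATE TABLE ... LOCATION` statement from a naming convention. The sketch below is only an illustration, not something from the video or the commenters: the function name `external_table_ddl`, the `base_dir` convention and the example bucket are all hypothetical. It assumes `base_dir` is covered by an external location you have privileges on, and that it is a directory separate from the schema's managed storage location; check the Unity Catalog path-overlap rules before relying on it.

```python
def external_table_ddl(catalog, schema, table, base_dir, columns):
    """Build a CREATE TABLE statement with a deterministic external path.

    Hypothetical helper: the table is placed at <base_dir>/<table>, so other
    tools can derive the path from the table name alone.
    `columns` maps column names to SQL type strings.
    """
    # Deterministic location: base directory plus the table name.
    location = f"{base_dir.rstrip('/')}/{table}"
    cols = ", ".join(f"`{name}` {dtype}" for name, dtype in columns.items())
    # Specifying LOCATION is what makes the table external rather than managed.
    return (
        f"CREATE TABLE IF NOT EXISTS `{catalog}`.`{schema}`.`{table}` ({cols}) "
        f"USING DELTA LOCATION '{location}'"
    )


# Example with made-up names; the bucket and identifiers are placeholders.
ddl = external_table_ddl(
    "main", "sales", "orders", "s3://example-bucket/sales/", {"id": "BIGINT"}
)
print(ddl)
```

On a Databricks cluster the resulting string could be passed to `spark.sql(ddl)`. Wrapping the convention in a helper keeps business users from typing paths manually, although it gives up managed-table features such as the ones tied to the managed layout.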

    • @allthingsdata (8 months ago)

      One reason the managed location has random IDs is soft delete, which enables the UNDROP feature.

  •  11 months ago

    Simon, please generate subtitles :) Thank you!

    • @AdvancingAnalytics (11 months ago) +1

      They're generated now! It always takes a little while for YouTube to catch up :)