Advancing Spark - Delta Sharing

  • Published 1 Dec 2024

COMMENTS • 30

  • @evogelpohl 2 years ago +1

    Nice work as always, Sir. It's clear that the bones of the Delta-based sharing ecosystem are here. Excited to see UI/UX layered over the top, à la new layered products.

  • @chittillavenkataviswanath1389

    You are truly amazing! Best learning experiences to start the new year.

  • @kuldipjoshi1406 2 years ago +4

    Hi, if you could make a detailed video about table access control, the hierarchy of how it works in Databricks, and best practices, that would be great. Awesome video, by the way.

  • @aqlanable 2 years ago

    Since we are talking about Delta Sharing, it's worth having a look at alert destinations and alerts in the SQL persona.

  • @gabrielcohensabban4968 2 years ago

    Could you please include a link to the notebook used in this video? Thanks, amazing video!!

  • @danhorus 2 years ago +1

    23:05 I have the exact same question. If ADLS is in a VNET with no public internet access, I don't suppose Delta Sharing would work because the recipient must be able to query the data directly from ADLS, right? This can be quite a deal breaker for building secure meshes

    • @ArcaLuiNeo 2 years ago

      I assume for such scenarios one has to start looking at a self-hosted Delta Sharing server...

  • @drummerboi4eva 2 years ago

    Amazing! Thanks for making these detailed videos, Simon! Do you know if dynamic data masking for GDPR is possible with Delta Sharing?

    • @aqlanable 2 years ago +1

      It's possible with Unity Catalog: you can mask at row level, column level, and data level, and in Power BI it will be masked.

    • @aqlanable 2 years ago +1

      Unfortunately, you will have to create views, and Delta Sharing doesn't support dynamic views at the time of writing, so you mostly need to go with Unity Catalog, create a dynamic view, and provide a SQL endpoint to Power BI.

  • @AprenderDados 1 year ago

    And who processes the data? Is Power BI reading Delta?
    Do I need to provide a cluster or any compute resource?

    • @AdvancingAnalytics 1 year ago

      Delta Sharing essentially just returns a payload of keys to access the underlying cloud files - so your client still does the reading/processing etc! The server part of Delta Sharing doesn't currently require any kind of cluster/compute etc
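
To make the "payload of keys" idea concrete, here is a minimal Python sketch of the client side. It assumes the table query response is newline-delimited JSON in which each `file` entry carries a pre-signed `url`, as the open Delta Sharing protocol describes. The sample body, the `example.invalid` URLs, and the helper name `extract_file_urls` are illustrative assumptions, not output from a real server.

```python
import json

# Illustrative response body only: ids, sizes and URLs are made up.
SAMPLE_RESPONSE = "\n".join([
    json.dumps({"protocol": {"minReaderVersion": 1}}),
    json.dumps({"metaData": {"id": "table-id", "partitionColumns": []}}),
    json.dumps({"file": {"url": "https://example.invalid/part-0000.parquet?sig=abc",
                         "id": "f0", "partitionValues": {}, "size": 573}}),
    json.dumps({"file": {"url": "https://example.invalid/part-0001.parquet?sig=def",
                         "id": "f1", "partitionValues": {}, "size": 612}}),
])


def extract_file_urls(ndjson_body):
    """Return the pre-signed file URLs listed in a table query response."""
    urls = []
    for line in ndjson_body.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        if "file" in record:
            urls.append(record["file"]["url"])
    return urls


file_urls = extract_file_urls(SAMPLE_RESPONSE)
print(len(file_urls))
```

The client would then download and read each of those files itself, which is why no cluster is needed on the sharing side.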

  • @aqlanable 2 years ago

    Delta Sharing is still not mature enough for enterprise use; however, I'm waiting for the post-GA developments around Delta Sharing and the data marketplace from Databricks.

  • @nayan001ujjain 2 years ago

    Hi, thanks for sharing the knowledge about Delta Sharing. Can you please explain how costing works in Delta Sharing and how many requests a user can make? Is there any limit? Is Databricks charging on the basis of IOPS?

    • @AdvancingAnalytics 2 years ago

      Good question - at the moment I've not seen any costs associated! There will be the underlying cost of storage access, data egress etc, but I've not seen a cost model from Databricks yet!

    • @nayan001ujjain 2 years ago

      @AdvancingAnalytics Thank you 😊

  • @rostislawkrassow7385 2 years ago

    Thanks for sharing the review. Could a view also be part of a share?

    • @danhorus 2 years ago +1

      The documentation on GitHub mentions support for views. I hope Simon can test it and let us know if there are limitations for views with joins, etc.
      I would also be a little worried about the security aspect of these views, because perhaps the recipient is able to retrieve the underlying SAS Key and access the unmodified table(s) in ADLS instead of a filtered view with row-level security

    • @rostislawkrassow7385 2 years ago

      That's exactly the point. A view with row-level security or a join inside requires creating new physical files in order to share them at file level with SAS tokens.
      Only with materialized views (a newly announced feature) would this work on an already persisted set of files.
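
The point about needing new physical files can be sketched in a few lines of Python. This is a simplified stand-in under stated assumptions: it uses CSV files instead of Delta/Parquet, and the table contents, the `region` column, and the helper `materialize_for_recipient` are all hypothetical. It only shows why file-level sharing pushes row-level filtering into a separate persisted copy.

```python
import csv
import os
import tempfile

# Hypothetical source table; "region" is the column used for row-level filtering.
rows = [
    {"customer": "a", "region": "EU", "amount": "10"},
    {"customer": "b", "region": "US", "amount": "20"},
    {"customer": "c", "region": "EU", "amount": "30"},
]


def materialize_for_recipient(source_rows, region, out_dir):
    """Write only the rows a recipient may see to a new physical file."""
    out_path = os.path.join(out_dir, f"sales_{region}.csv")
    visible = [r for r in source_rows if r["region"] == region]
    with open(out_path, "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["customer", "region", "amount"])
        writer.writeheader()
        writer.writerows(visible)
    return out_path


out_dir = tempfile.mkdtemp()
eu_path = materialize_for_recipient(rows, "EU", out_dir)

# Only this filtered file would be shared, never the unfiltered source.
with open(eu_path, newline="") as fh:
    eu_rows = list(csv.DictReader(fh))
print(len(eu_rows))
```

Because the recipient is handed access to files rather than to a query engine, anything not written into the shared copy stays out of reach.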

    • @aqlanable 2 years ago

      Dynamic views and regular views are still planned for post-GA; currently only tables are supported.

    • @rostislawkrassow7385 2 years ago

      @aqlanable Thank you for sharing the insight! Curious to see how that will work.

  • @rickrofe4382 2 years ago

    Thanks for the preview. Do you know if the same integration with Power BI still works in AWS?

    • @AdvancingAnalytics 2 years ago +2

      Yep! From the recipient's point of view, the Delta Sharing Server could be in Azure Databricks, AWS, a local web server, anywhere! That's the beauty of it being an open protocol!
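
From the recipient's side, the server location really is just a field in a small credentials file. The sketch below, in Python, writes such a profile file and builds the `<profile>#<share>.<schema>.<table>` address that the open-source `delta-sharing` client expects. The endpoint, token placeholder, and the share, schema, and table names are hypothetical; a real provider would issue the actual values.

```python
import json
import os
import tempfile

# Hypothetical values: the endpoint could point at Azure, AWS or a local server.
profile = {
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.invalid/delta-sharing/",
    "bearerToken": "<token issued by the provider>",
}

profile_path = os.path.join(tempfile.mkdtemp(), "config.share")
with open(profile_path, "w") as fh:
    json.dump(profile, fh)


def table_url(profile_file, share, schema, table):
    """Build the address format used by the delta-sharing client."""
    return f"{profile_file}#{share}.{schema}.{table}"


url = table_url(profile_path, "my_share", "my_schema", "my_table")
print(url)

# With the client installed (pip install delta-sharing), the same URL
# would be passed to the reader, for example:
# import delta_sharing
# df = delta_sharing.load_as_pandas(url)
```

Swapping providers or clouds only changes the `endpoint` and `bearerToken` values; the recipient code stays the same.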

    • @rickrofe4382 2 years ago

      @AdvancingAnalytics Super cool!

    • @vinodhkumarganesan6778 2 years ago

      @AdvancingAnalytics Hi, did you see or experience a performance improvement with Power BI running on a Delta share rather than on a SQL warehouse?

  • @ddarkings 1 year ago

    Is there an advantage to setting up a Delta share for PBI as opposed to linking PBI directly to a SQL endpoint in Databricks, as shown in the Partner Connect demos? I guess it's a way of limiting which tables can be seen in PBI. Are there other benefits, given that there is more to set up when doing it the Delta share way?

    • @AdvancingAnalytics 1 year ago +1

      A couple of reasons: 1) Delta share doesn't use Databricks compute (i.e., it's cheaper), albeit with some limitations; 2) it's primarily focused on users outside of your AD tenant, who would not be able to connect to your DBX endpoint.

  • @akhilannan 2 years ago

    Can you add a view to the share, or does it have to be a table?

    • @aqlanable 2 years ago

      Currently only tables are supported; they are working on views for post-GA, so you'll have to wait a couple of months.

  • @seyma4479 2 years ago

    It would be great if you made a video on how to build a Delta Sharing server on localhost serving data from S3 🙂🙂