Advancing Spark - Provisioning Databricks Users through SCIM
- Published 5 Oct 2024
- One of the biggest pains in managing Databricks has been user management, and with people starting to split their work across several workspaces this becomes even trickier to manage. We've previously built out integrations through the API, automatically pushing hard-coded lists of users into Databricks... but there has to be a better way... and there is!
In this video Simon walks through the SCIM connector, an enterprise application we can create inside Azure Active Directory that automatically provisions users and groups within a destination Databricks workspace. This is also used when provisioning the Databricks Account Console, which is a vital part of Unity Catalog, so it's worth getting things set up in advance!
For more info on Databricks SCIM - see the docs here: docs.microsoft...
As always, if Advancing Analytics can help you on your lakehouse journey, get in touch!
Great video! Just one thing, though: once Unity Catalog is set up, provisioning must be done through the Databricks account (by an account admin) rather than at workspace level, so the token is generated at account level. After provisioning is enabled, AD users/groups are synced in as Databricks account users/groups, which workspace admins or account admins can then assign workspace-level access.
Yeah, this vid was before identity federation went live! Probably another vid I need to do!
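To make the workspace-level versus account-level distinction above concrete, here is a minimal sketch of the two SCIM base URLs the enterprise application would point at. The function name `scim_base_url` is hypothetical, and the Azure account host and URL patterns are written from memory of the Databricks SCIM docs, so treat them as assumptions and check them against your own environment.

```python
from typing import Optional

# Assumed host for the Azure Databricks account console.
ACCOUNT_HOST = "accounts.azuredatabricks.net"


def scim_base_url(workspace_host: Optional[str] = None,
                  account_id: Optional[str] = None) -> str:
    """Return the SCIM base URL for an account (preferred) or a workspace."""
    if account_id:
        # Account-level provisioning: identities land as account users/groups.
        return f"https://{ACCOUNT_HOST}/api/2.0/accounts/{account_id}/scim/v2"
    if workspace_host:
        # Workspace-level provisioning: identities land in that one workspace.
        return f"https://{workspace_host}/api/2.0/preview/scim/v2"
    raise ValueError("Provide either workspace_host or account_id")
```

With account-level provisioning there is one URL and one token per account, and workspace access is assigned afterwards, as the comment above describes.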
I haven't had a chance to use SCIM, as most of my work involves an environment built by DevOps professionals, but it sounds like a great way to sync AAD users with Databricks.
Hi Simon!
Great video again!
What a fantastic feature. Finally it’s here. Can you provide details on how to specify, in the SCIM connector, what permissions a person or group should have in Databricks (e.g. only Databricks SQL)?
Loved it. Thanks for the video!
Thanks for sharing
Is anybody else finding that SPNs inside a SCIM-synced Azure AD group are not provisioned to the Databricks workspace? I'd expect the same issue for managed identities.
That is unfortunately correct. The AAD Enterprise App doesn't SCIM over SPs or MIs. :( You can use the Databricks Terraform Provider to do this though.
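The reply above suggests the Databricks Terraform Provider. A different route, sketched here in Python rather than Terraform, is to register the service principal directly through the workspace SCIM API. The helper name `build_service_principal_request` and the example values are hypothetical, and the endpoint path and schema URN are assumptions based on the Databricks SCIM docs. The request is only built, not sent.

```python
import json
import urllib.request


def build_service_principal_request(workspace_host: str, token: str,
                                    application_id: str,
                                    display_name: str) -> urllib.request.Request:
    """Build (but do not send) a SCIM request that adds a service principal."""
    body = {
        # Assumed Databricks-specific SCIM schema for service principals.
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:ServicePrincipal"],
        # The Azure AD application (client) ID of the service principal.
        "applicationId": application_id,
        "displayName": display_name,
    }
    return urllib.request.Request(
        url=f"https://{workspace_host}/api/2.0/preview/scim/v2/ServicePrincipals",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/scim+json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit it; that step is left out so the sketch can be inspected without touching a real workspace.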
Thank you Simon for the great video! Love the SCIM pun :P
One question: if the list of users and groups is not known upfront and they are created later in Azure AD, how can they be added to the SCIM connector afterwards? Is the SCIM API meant for that?
Hi! You can amend the users & groups in the SCIM connector at any time; there are settings for how often it syncs with the destination. Also, it's worth looking at the Identity Federation news from this month, as this is an alternative approach for managing Databricks users in Azure! docs.microsoft.com/en-gb/azure/databricks/administration-guide/users-groups/#enable-identity-federation
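On the "is the SCIM API meant for that?" part of the question: the connector is a client of the same SCIM endpoints you could call yourself. As a rough illustration, this is the kind of User resource body a SCIM client sends when a new user appears. The helper name `build_scim_user` and the example address are hypothetical, and the entitlement value `databricks-sql-access` is an assumption about how Databricks SQL access is named, so check it against the docs before relying on it.

```python
from typing import Dict, List, Optional


def build_scim_user(user_name: str,
                    entitlements: Optional[List[str]] = None) -> Dict:
    """Build a SCIM 2.0 User resource body for a newly added user."""
    payload: Dict = {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
        "userName": user_name,
    }
    if entitlements:
        # Each entitlement is sent as an object with a "value" key.
        payload["entitlements"] = [{"value": name} for name in entitlements]
    return payload
```

For example, `build_scim_user("new.user@example.com", ["databricks-sql-access"])` yields a body with one entitlement entry, while omitting the second argument leaves entitlements out entirely.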
Can you have multiple Databricks workspaces? Or how would you go about doing this for multiple workspaces?
Hi Simon, one question: if I have to implement fine-grained access control in Databricks at the Unity Catalog level, where we rely on Databricks groups for access control, can that be achieved using an Azure AD group synced with Databricks through SCIM provisioning?
Or do I have to create Databricks groups separately for this access control?
Any inkling of when this might go GA? I’ve had my eye on it for a while but I’m reluctant to use it whilst it’s still in Public Preview.
No idea! Although I expect a raft of announcements next week, not sure if SCIM is one of them!
I seriously can’t wait! I’m so fed up of manually maintaining my users and groups across dev/sit/pre-prod/prod - it’s soooo tedious! 🤣
Hi Simon, would you mind showing how to do this using AWS?
What if I have more than one workspace? Do I need to add them all in provisioning? If I do, how will my users and groups get access? For example, I have added 5 workspaces in provisioning, which are used by different work groups, but we have created a single enterprise application with one common list of users/groups. If I add a user, will that user have access to all workspaces?
Using this, can we sync users in an Azure AD group over to a Databricks Group?
Yep, that's exactly what this is for!
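For a sense of what that group sync looks like on the wire, here is a small sketch of the two SCIM bodies involved: one that creates a group and one that adds members to an existing group. The helper names are hypothetical, and the member IDs are assumed to be the Databricks-side IDs of users that have already been provisioned.

```python
from typing import Dict, List, Optional


def build_scim_group(display_name: str,
                     member_ids: Optional[List[str]] = None) -> Dict:
    """Build a SCIM 2.0 Group resource body, optionally with members."""
    payload: Dict = {
        "schemas": ["urn:ietf:params:scim:schemas:core:2.0:Group"],
        "displayName": display_name,
    }
    if member_ids:
        payload["members"] = [{"value": member_id} for member_id in member_ids]
    return payload


def build_add_members_patch(member_ids: List[str]) -> Dict:
    """Build a SCIM PatchOp body that adds members to an existing group."""
    return {
        "schemas": ["urn:ietf:params:scim:api:messages:2.0:PatchOp"],
        "Operations": [
            {
                "op": "add",
                "value": {"members": [{"value": m} for m in member_ids]},
            }
        ],
    }
```

The first body would be POSTed to the Groups endpoint and the second PATCHed to a specific group; the connector does the equivalent on each sync cycle.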