Microsoft Fabric Spark Utilities - mssparkutils
- Published Nov 6, 2023
- Welcome back to another episode of Fabric Espresso DE&DS series. In this episode, we introduce and explore the nuances of Fabric Notebooks and a library named MSSparkUtils. Microsoft Spark Utilities (MSSparkUtils) is a built-in package to help you easily perform common tasks. You can use MSSparkUtils to work with file systems, to get environment variables, to chain notebooks together, and to work with secrets. MSSparkUtils are available in PySpark (Python), Scala, .NET Spark (C#), and R (Preview) notebooks and Synapse pipelines.
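To make the capabilities listed above concrete, here is a minimal sketch of the kinds of calls the episode covers (file system listing and environment lookups). The guarded import is an assumption for running outside a notebook; inside a Fabric or Synapse notebook, `mssparkutils` is available directly.

```python
# Hedged sketch of common MSSparkUtils tasks. The try/except lets the
# file parse and run outside a Fabric/Synapse notebook (e.g., locally),
# where the library is not installed.
try:
    from notebookutils import mssparkutils  # available in Fabric/Synapse notebooks
except ImportError:  # running outside a notebook session
    mssparkutils = None


def list_files(path="Files/"):
    """List file names under a Lakehouse-relative path via mssparkutils.fs."""
    if mssparkutils is None:
        return []  # fallback when not running in a notebook
    return [f.name for f in mssparkutils.fs.ls(path)]


def workspace_name():
    """Read the current workspace name via the environment helpers."""
    if mssparkutils is None:
        return None
    return mssparkutils.env.getWorkspaceName()
```

Outside a notebook the helpers fall back to empty/None values, so the same module can be unit-tested locally.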
🎙 Meet the Speakers:
👤 Guest from Microsoft Fabric Product Group: Jene Zhang, Senior Program Manager of Fabric Notebooks
Jene is a Senior Program Manager at Microsoft with a passion for developing advanced developer tools and AI technology. Currently, she is spearheading the creation of a specialized Spark notebook aimed at enabling data engineers and scientists to excel in their domains.
LinkedIn: / edelweissno1
Twitter: / edelweissno1
👤 Host: Estera Kot, Senior Product Manager at Microsoft and a member of the Fabric Product Group. She holds the role of Product Owner for Apache Spark-based runtimes in Microsoft Fabric and Synapse Analytics. Estera is a Data & AI Architect and is passionate about computer science.
LinkedIn: / esterakot
Twitter: / estera_kot
👍 Like this video? Don't forget to hit the 'Like' button and share it with your network!
🔔 Stay Updated: For more insights into Microsoft Fabric Data Engineering and Data Science, and all things tech, make sure to subscribe to our channel and hit the notification bell so you never miss an episode!
#Microsoft #MicrosoftFabric #FabricNotebooks #DataScience #BigData #DataAnalytics #ApacheSpark #JupyterNotebook #SynapseNotebook #IPywidgets #TechInnovation #mssparkutils #MicrosoftSparkUtilities
Shared! Really good, and it helped clear up some doubts I had about .exit and .run. Thank you so much!
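For anyone with the same .exit/.run doubts: the value a child notebook passes to `mssparkutils.notebook.exit()` comes back to the parent as the string return value of `mssparkutils.notebook.run()`. A minimal sketch of that round trip, with the guarded import an assumption for running outside a notebook:

```python
import json

try:
    from notebookutils import mssparkutils  # available in Fabric/Synapse notebooks
except ImportError:
    mssparkutils = None


# --- In the child notebook: hand a result back to the caller ---
def finish_child(rows_written: int) -> str:
    """Exit the child notebook, passing a JSON string back to .run()."""
    payload = json.dumps({"rows_written": rows_written})
    if mssparkutils is not None:
        mssparkutils.notebook.exit(payload)  # stops the child notebook here
    return payload  # fallback when not running in a notebook


# --- In the parent notebook: .run() returns the child's exit value ---
def parse_exit_value(raw: str) -> dict:
    """The value given to .exit() arrives at the parent as a string."""
    return json.loads(raw)
```

Using JSON keeps the exit value structured, since `.run()` only ever returns a string.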
Oftentimes, Databricks notebooks can be refactored to use mssparkutils instead of dbutils on Synapse with very little effort. The fact is that both can be used within an org, especially with the UC model now.
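The comment above is right that many calls map almost one-to-one. The mapping below is an illustrative, non-exhaustive sketch (note that some signatures differ, e.g. Databricks' `dbutils.secrets.get(scope, key)` versus `mssparkutils.credentials.getSecret(vault, secret)`, so a pure textual rewrite is only a starting point):

```python
# Rough dbutils -> mssparkutils call mapping (illustrative, not exhaustive).
# Argument semantics are NOT always identical; review each rewritten call.
DBUTILS_TO_MSSPARKUTILS = {
    "dbutils.fs.ls": "mssparkutils.fs.ls",
    "dbutils.fs.cp": "mssparkutils.fs.cp",
    "dbutils.notebook.run": "mssparkutils.notebook.run",
    "dbutils.notebook.exit": "mssparkutils.notebook.exit",
    # secrets: getSecret takes a Key Vault reference, not a Databricks scope
    "dbutils.secrets.get": "mssparkutils.credentials.getSecret",
}


def port_line(line: str) -> str:
    """Naively rewrite one line of Databricks notebook code for Synapse/Fabric."""
    for old, new in DBUTILS_TO_MSSPARKUTILS.items():
        line = line.replace(old, new)
    return line
```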
Great demo 👍🏻 Can you share your notebook, Jene?
Do you guys provide the notebook files? Thanks
very good
Is there a GUI/DAG representation of the statuses of multiple child notebooks running in parallel?
For example, in one GUI view, can I see that child NB 1 is 70% complete, child NB 2 is 60% complete, etc.?
I know you could parse out metadata to do that, but I was wondering if perhaps an MS extension to the Spark UI, or something equivalent, was available.
Of course, ADF could do something similar if the notebook activities are synchronous, but I was wondering if there was something else available?
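Related to the question above: in Fabric, `mssparkutils.notebook.runMultiple` can run several child notebooks in parallel from a DAG-shaped dict, and the notebook cell renders a snapshot view of each child's status. A hedged sketch of building that argument follows; the notebook names ("ChildNB1", "ChildNB2") and the exact set of supported keys are assumptions to verify against the docs for your runtime.

```python
try:
    from notebookutils import mssparkutils  # available in Fabric notebooks
except ImportError:
    mssparkutils = None


def build_dag(names, timeout_per_cell=90):
    """Build a DAG dict of the shape runMultiple accepts: one activity per
    child notebook, each with optional args and dependencies."""
    return {
        "activities": [
            {
                "name": n,                 # display name of the activity
                "path": n,                 # notebook to run (same workspace)
                "timeoutPerCellInSeconds": timeout_per_cell,
                "args": {},                # parameters passed to the child
                "dependencies": [],        # empty -> all run in parallel
            }
            for n in names
        ]
    }


def run_children(names):
    """Run the child notebooks in parallel; returns None outside a notebook."""
    if mssparkutils is None:
        return None
    return mssparkutils.notebook.runMultiple(build_dag(names))
```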
That's a good suggestion! I like the idea, we'll see what we can do here😊
How do I get the notebook path? I try to run a second notebook with mssparkutils, but it throws errors related to the path of the called notebook.
I can't find anything in the docs.
There must be an easy way, right? Can you help? Thanks
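One likely cause of path errors like this (an assumption, without seeing the exact error): `mssparkutils.notebook.run` expects the target notebook's name within the same workspace, not a file-system path. A minimal sketch, where "ChildNotebook" is a hypothetical name:

```python
try:
    from notebookutils import mssparkutils  # available in Fabric/Synapse notebooks
except ImportError:
    mssparkutils = None


def run_by_name(notebook_name="ChildNotebook", timeout_seconds=300, args=None):
    """Call a sibling notebook by its workspace NAME, not a file path.
    Returns whatever the child passes to mssparkutils.notebook.exit()."""
    if mssparkutils is None:
        return None  # fallback when not running in a notebook
    return mssparkutils.notebook.run(notebook_name, timeout_seconds, args or {})
```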
Can you share more detail about the issue?