Microsoft Fabric Spark Utilities - mssparkutils

  • Published Nov 6, 2023
  • Welcome back to another episode of the Fabric Espresso DE&DS series. In this episode, we introduce and explore the nuances of Fabric Notebooks and a library named MSSparkUtils. Microsoft Spark Utilities (MSSparkUtils) is a built-in package that helps you easily perform common tasks. You can use MSSparkUtils to work with file systems, get environment variables, chain notebooks together, and work with secrets. MSSparkUtils is available in PySpark (Python), Scala, .NET Spark (C#), and R (Preview) notebooks, as well as in Synapse pipelines. (A short usage sketch follows the episode details below.)
    🎙 Meet the Speakers:
    👤 Guest from Microsoft Fabric Product Group: Jene Zhang, Senior Program Manager of Fabric Notebooks
    Jene is a Senior Program Manager at Microsoft with a passion for developing advanced developer tools and AI technology. Currently, she is spearheading the creation of a specialized Spark notebook aimed at enabling data engineers and scientists to excel in their domains.
    LinkedIn: / edelweissno1
    Twitter: / edelweissno1
    👤 Host: Estera Kot, Senior Product Manager at Microsoft and a member of the Fabric Product Group. She holds the role of Product Owner for Apache Spark-based runtimes in Microsoft Fabric and Synapse Analytics. Estera is a Data & AI Architect and is passionate about computer science.
    LinkedIn: / esterakot
    Twitter: / estera_kot
    👍 Like this video? Don't forget to hit the 'Like' button and share it with your network!
    🔔 Stay Updated: For more insights into Microsoft Fabric Data Engineering and Data Science, and all things tech, make sure to subscribe to our channel and hit the notification bell so you never miss an episode!
    #Microsoft #MicrosoftFabric #FabricNotebooks #DataScience #BigData #DataAnalytics #ApacheSpark #JupyterNotebook #SynapseNotebook #IPywidgets #TechInnovation #mssparkutils #MicrosoftSparkUtilities
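
For orientation, since the description lists the four main areas (file system, environment, notebook chaining, secrets), here is a minimal, hedged sketch of what those calls look like in a PySpark notebook. The paths, notebook name, and Key Vault/secret names are placeholders, and the exact getSecret signature differs slightly between Synapse and Fabric.

```python
# Minimal sketch of common mssparkutils calls (PySpark notebook).
# All paths, names, and secrets below are hypothetical placeholders.
from notebookutils import mssparkutils  # usually pre-loaded in Fabric/Synapse notebooks

# File system helpers (relative paths resolve against the default lakehouse in Fabric)
files = mssparkutils.fs.ls("Files/raw")                        # list a folder
mssparkutils.fs.mkdirs("Files/staging")                        # create a folder
mssparkutils.fs.put("Files/staging/hello.txt", "hello", True)  # write a small file, overwrite=True

# Environment helpers
print(mssparkutils.env.getWorkspaceName())

# Chain notebooks: run a child and capture the value it passes to notebook.exit()
result = mssparkutils.notebook.run("ChildNotebook", 300, {"run_date": "2023-11-06"})

# Secrets from Azure Key Vault (signature varies by environment; this is the Synapse-style call)
secret = mssparkutils.credentials.getSecret("my-key-vault", "my-secret-name")
```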

COMMENTS • 9

  • @luizfernandosp • 3 months ago

    Shared it. Really good, and it helped me with some doubts I had related to .exit and .run! Thank you so much.
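
For readers with the same question about .exit and .run, here is a minimal sketch of how the two fit together; the notebook name, timeout, and parameters are hypothetical.

```python
# Parent notebook: run a child with a 90-second timeout and some parameters,
# then capture whatever value the child hands back via notebook.exit().
returned = mssparkutils.notebook.run("ChildNotebook", 90, {"input_date": "2023-11-06"})
print(returned)

# Child notebook ("ChildNotebook"): stop execution here and return a value to the caller.
mssparkutils.notebook.exit("rows_processed=42")
```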

  • @NeumsFor9 • 8 months ago +2

    Oftentimes, Databricks notebooks can be refactored to use mssparkutils instead of dbutils on Synapse with extremely little effort (a rough mapping is sketched below). The fact is that both can be used within an org, especially with the UC model now.
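
As a rough illustration of that refactoring, a hedged before/after of a few common calls. The paths, notebook, and vault/secret names are placeholders, and secrets map from Databricks secret scopes to Azure Key Vault rather than one-to-one.

```python
# Hypothetical refactor of a Databricks cell to mssparkutils on Synapse/Fabric.
from notebookutils import mssparkutils

# was: dbutils.fs.ls("/mnt/raw")
files = mssparkutils.fs.ls("Files/raw")

# was: dbutils.notebook.run("Child", 300, {"d": "2023-11-06"})
result = mssparkutils.notebook.run("Child", 300, {"d": "2023-11-06"})

# was: dbutils.secrets.get("my-scope", "my-secret")  -- secret scope maps to a Key Vault here
secret = mssparkutils.credentials.getSecret("my-key-vault", "my-secret")
```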

  • @HundahlNiels • 8 months ago +2

    Great demo 👍🏻 Can you share your notebook, Jene?

  • @DanielWeikert • 8 months ago +2

    Do you guys provide the notebook files? Thx

  • @georgehu8652 • 5 months ago

    very good

  • @NeumsFor9 • 8 months ago +2

    Is there a GUI/DAG representation of the statuses of multiple child notebooks running in parallel?
    For example, in one GUI view can I see that child NB 1 is 70% complete, child NB 2 is 60% complete, etc.?
    I know you could parse out metadata to do that, but I was wondering if perhaps an MS extension to the Spark UI or something equivalent was available.
    Of course, ADF could do something similar if the notebook activities are synchronous, but I was wondering if there was something else available? (A DIY fan-out sketch follows this thread.)

    • @JeneZhang • 8 months ago

      That's a good suggestion! I like the idea; we'll see what we can do here 😊
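
In the meantime, one hand-rolled way to fan out child notebooks in parallel from a parent notebook is a plain thread pool around notebook.run. This gives completion status and exit values, not per-notebook percent-complete; the notebook names and parameters are hypothetical.

```python
# Run several child notebooks concurrently and collect their exit values.
from concurrent.futures import ThreadPoolExecutor, as_completed
from notebookutils import mssparkutils  # usually pre-loaded in the notebook session

children = ["ChildNB1", "ChildNB2", "ChildNB3"]

def run_child(name):
    # 1800-second timeout per child; the parameters dict is optional
    return name, mssparkutils.notebook.run(name, 1800, {"parent": "Orchestrator"})

with ThreadPoolExecutor(max_workers=len(children)) as pool:
    futures = [pool.submit(run_child, nb) for nb in children]
    for future in as_completed(futures):
        name, exit_value = future.result()
        print(f"{name} finished with exit value: {exit_value}")
```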

  • @DanielWeikert • 8 months ago

    How do I get the notebook path? I'm trying to run a second notebook with mssparkutils, but it throws errors related to the path of the called notebook.
    I can't find anything in the docs.
    There must be some easy way, right? Can you help? Thx (See the sketch after this thread.)

    • @robinlin3884 • 8 months ago

      Can you share more detail about the issue?
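
For anyone hitting the same error: notebook.run is normally called with the target notebook's name (for a notebook in the same workspace), not with a file-system path, which is a common source of path-related errors. A minimal sketch, with a hypothetical notebook name and parameters:

```python
# Call a second notebook in the same workspace by its name, not a Files/... or abfss:// path.
from notebookutils import mssparkutils

exit_value = mssparkutils.notebook.run(
    "SecondNotebook",           # target notebook's name in the current workspace
    600,                        # timeout in seconds
    {"source_table": "sales"},  # optional parameters surfaced to the child notebook
)
print(exit_value)
```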