New Developments in the Open Source Ecosystem: Apache Spark 3.0, Delta Lake, and Koalas

  • Published 7 Nov 2024

COMMENTS • 21

  • @chinmayabarik557 4 years ago

    Delta Lake & Koalas are going to be game changers in the field of data analytics

  • @rayjohn5163 3 years ago +1

    That's an impressive demo!

  • @MPXVM 4 years ago +1

    When you want to scroll down the Jupyter file in the video at 30:00 and you end up scrolling the YouTube page :))))))

  • @NitinPasumarthy 5 years ago +1

    Where is the `plot` function at 17:05 coming from? Does "Apache" Spark natively support displaying dataframes in "Jupyter" notebooks?

  • @swarnalakshmib 5 years ago +2

    the demo was really well done.

  • @alaham2590 5 years ago

    Very nice. Things will be more exciting using Spark with all these new features. One question though: on an existing installation running Spark 2, how easy is the upgrade?

  • @AnirbanNagDev 5 years ago

    Can we get the partition pruning demo or video here on YouTube?

  • @californiaesnuestra 5 years ago +1

    Very good demo, amazing presentation

    • @lackshubalasubramaniam7311 3 years ago

      Michael's energy is infectious, and he also gave a really good overview of the Spark story.

  • @UkrozaVR 5 years ago +2

    Invite me next time and I'll initiate loud applause at the right moments. It just feels like it's missing from such a presentation :)

  • @NeerajGarg 4 years ago

    This is amazing!!

  • @vincenttan6303 4 years ago +2

    6:28... I wonder if the audience clapped because of the 2X or because of being able to save a few lines of code...

  • @yuanji102 5 years ago

    Do I need to install Koalas on every node of the cluster or just on the master?

    • @liu3gz 5 years ago

      From my experience with Spark clusters, you will need to install the package on all nodes. I usually use Red Hat Ansible to manage multi-node configuration with ease.

  • @mahfuzurrahman4517 1 year ago

    great

  • @dreznik 5 years ago

    Nice! When will Koalas be available for R?

  • @sahihe9856 5 years ago

    22:57: can someone give some details about that “forecast=true”?

    • @Bhaweshkumarsingh 5 years ago

      The plot could be their custom function,
      with the forecast being a param... just a wild guess from the looks of it.

    • @cu7695 4 years ago

      🤣😁😁 It looked like a simple linear forecast. For a business case, you need a logarithmic fit dependent on time.

  • @bankoftrustnwobot3218 4 years ago

    Can we drop Python support for Spark?

  • @albertoandreotti7940 5 years ago

    People applauding optimizations that SQL had already introduced 25 years ago. Really, kids don't study databases anymore?