How does Ray compare to Apache Spark?

  • Published 8 Jan 2025

COMMENTS • 14

  • @andrewcampbell7011 3 years ago +7

    Thank you. This is by far the best overview of the distinction between these platforms. I'm a long-time Spark user and Ray newbie, and this breakdown rings true. I really like Ray for hyperparameter tuning and model serving.

  • @golagaz 3 years ago +1

    Thank you Jonathan. Nice presentation.

  • @christianrakotondrainibe6256 3 years ago +2

    Thank you very much for your videos, they're awesome! Do you also know how to set up a Ray cluster on two different machines, please?

    • @jonathanuniversity 3 years ago +1

      Ray has a pretty nice library called Ray Cluster that manages all the coordination needed to set up Ray on multiple machines (or deploy to the cloud): docs.ray.io/en/master/cluster/index.html
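
A minimal sketch of the two-machine case asked about above, assuming Ray is installed on both machines and that they can reach each other on port 6379. The helper names (`head_command`, `worker_command`, `cluster_resources`) and the address `192.0.2.10` are hypothetical placeholders. `ray start --head`, `ray start --address=...`, and `ray.init(address="auto")` are standard Ray entry points, but the linked docs are the place to check which options a given Ray version supports.

```python
import shlex
from typing import Dict, List

RAY_PORT = 6379  # port the head node listens on (assumed reachable from the worker)


def head_command(port: int = RAY_PORT) -> List[str]:
    """Command to run on the machine chosen as the head node."""
    return ["ray", "start", "--head", f"--port={port}"]


def worker_command(head_ip: str, port: int = RAY_PORT) -> List[str]:
    """Command to run on the second machine so that it joins the head node."""
    return ["ray", "start", f"--address={head_ip}:{port}"]


def cluster_resources() -> Dict[str, float]:
    """Attach to the already-running cluster and report its total resources."""
    import ray  # imported lazily so the helpers above work without Ray installed

    ray.init(address="auto")  # connect to the existing cluster, do not start a new one
    return ray.cluster_resources()


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder; substitute the head machine's real IP address.
    print("machine 1:", shlex.join(head_command()))
    print("machine 2:", shlex.join(worker_command("192.0.2.10")))
```

Run the first printed command on one machine and the second on the other; calling `cluster_resources()` from either machine afterwards should list the CPUs of both nodes if the worker joined successfully.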

  • @SigmundVestergaard 2 years ago +3

    Thank you for this video. I was wondering: were Ray Datasets released after you made this video? It must be the case, because otherwise I'm sure you'd have mentioned them.

    • @jonathanuniversity 2 years ago +1

      Yeah, at the time of the video the only official Ray modules were the ones from the diagram, and the ecosystem has changed pretty dramatically since then.

  • @JavArButt 1 year ago +1

    Very nice, thank you for that

  • @tripathi26 2 years ago

    Very informative. Thanks for sharing your knowledge.
    I wonder why you have stopped uploading new videos!

    • @jonathanuniversity 2 years ago

      I got distracted with other work for the past year, but I am recording a new video now 🤗 and will try to be consistent going forward.

    • @tripathi26 2 years ago +1

      @jonathanuniversity ❤️ Glad to hear that. Eagerly waiting 😊

  • @wilsonnybinghamton 2 years ago

    Hi Jonathan, I am wondering whether serverless will replace Flink and Spark in the future. I am thinking that Beam could be something a serverless platform uses to replace Flink and Spark.

    • @jonathanuniversity 2 years ago +1

      I see them as somewhat complementary. Serverless is really an infrastructure concept, whereas Spark/Flink/Ray/etc. are more programming models (kind of analogous to the distinction between user space and kernel space on personal computers).
      So you can in theory have a serverless deployment of Flink/Spark/etc., which is exactly what products like Amazon EMR Serverless provide.

    • @wilsonnybinghamton 2 years ago

      @jonathanuniversity Thanks for the reply! Btw, if I have Beam running on top of a FaaS framework that provides alternative runners other than Spark/Flink, does this make sense? Is it an alternative to them and a direct competitor?

    • @jonathanuniversity 2 years ago +1

      @wilsonnybinghamton Without knowing the exact FaaS platform internals it is hard to say for sure, but if you are writing/running Beam it shouldn't really matter what the underlying runners are.
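
The point about runners can be illustrated with a toy sketch in plain Python. This is not the Beam API: `Pipeline`, `apply_all`, `run_sequential`, and `run_threaded` are hypothetical names made up for the illustration. It only shows the idea that a pipeline is defined once and the execution engine is chosen separately, so swapping the engine should not change the result. In real Beam the engine is typically selected through the `--runner` pipeline option rather than in the pipeline code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List

# Toy model: a "pipeline" is an ordered list of per-element functions.
Pipeline = List[Callable[[int], int]]


def apply_all(pipeline: Pipeline, x: int) -> int:
    """Push one element through every step of the pipeline, in order."""
    for step in pipeline:
        x = step(x)
    return x


def run_sequential(pipeline: Pipeline, data: Iterable[int]) -> List[int]:
    """Hypothetical 'runner' that processes elements one at a time."""
    return [apply_all(pipeline, x) for x in data]


def run_threaded(pipeline: Pipeline, data: Iterable[int]) -> List[int]:
    """Hypothetical 'runner' that spreads elements over a thread pool."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map yields results in input order, so both runners agree.
        return list(pool.map(lambda x: apply_all(pipeline, x), data))


# The pipeline is written once, with no reference to how it will be executed.
pipeline: Pipeline = [lambda x: x + 1, lambda x: x * 2]
data = [1, 2, 3]

if __name__ == "__main__":
    print(run_sequential(pipeline, data))
    print(run_threaded(pipeline, data))
```

Both runners return the same list for the same pipeline and input, which is the sense in which the pipeline author does not need to care which runner sits underneath.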