Thank you. This is by far the best overview of the distinction between these platforms. I'm a long-time Spark user and Ray newbie, and this breakdown rings true. I really like Ray for hyperparameter tuning and model serving.
Thank you Jonathan. Nice presentation.
Thank you very much for your videos, they're awesome! Do you also know how to set up a Ray cluster on two different machines, please?
Ray has a pretty nice library called Ray Cluster that manages all the coordination needed to set up the library on multiple machines (or deploy to the cloud). docs.ray.io/en/master/cluster/index.html
Thank you for this video. I was wondering: were Ray Datasets released after you made this video? It must be the case, because otherwise I'm sure you'd have mentioned them.
Yeah, at the time of the video the only official Ray modules were the ones from the diagram, and the ecosystem has changed pretty dramatically since then.
Very nice, thank you for that
Very Informative. Thanks for sharing your knowledge.
I wonder why you have stopped uploading new videos!
Got distracted with other work for the past year, but I am actually recording a new video now 🤗 and will try to be consistent going forward.
@jonathanuniversity ❤️ Glad to hear that. Eagerly waiting 😊
Hi Jonathan, I am wondering whether serverless will replace Flink and Spark in the future. I am thinking Beam could be something a serverless platform uses to replace Flink and Spark.
I see them as somewhat complementary. Serverless is really an infrastructure concept, whereas Spark/Flink/Ray/etc. are more programming models (kind of analogous to the distinction between user space and kernel space on personal computers). So you can in theory have a serverless deployment of Flink/Spark/etc., which is exactly what products like Amazon EMR Serverless provide.
@jonathanuniversity Thanks for the reply! By the way, if I have Beam running on top of a FaaS framework that provides alternative runners other than Spark/Flink, does this make sense? Is it an alternative to and direct competitor of them?
@wilsonnybinghamton Without knowing the exact FaaS platform internals it is hard to say for sure, but if you are writing/running Beam, it shouldn't really matter what the underlying runners are.