I have completed this whole playlist. It is full of knowledge and gave me enough confidence to handle questions.
I saw your bucketing video and it was awesome, so I decided to complete the whole playlist, and here I am at the end.
I really learnt a lot. Please make more videos, as you are our senior DE. 😅
Hey @coolraviraj24, I'm elated to know this and super glad that the playlist has helped strengthen your concepts. Appreciate you putting out this note here. Please also share it w/ your friends and colleagues :)
Thanks for the great video; you make these concepts so simple. Thanks
You deserve more subscribers!! Thanks for explaining the concepts.
Those words mean a lot, thank you @gopinathdhanasekar328! If you wouldn't mind, please share it with your friends and colleagues. I would greatly appreciate your help in spreading the word.
Great tutorials 🙏, please create more videos on Spark from a beginner's point of view
Loving ur videos Bro !
Thanks for another in-depth video. Yes, we need a video on how Spark uses its memory and executors, and on what basis it splits data across multiple executors.
Resource level optimisation videos upcoming in the next few weeks, stay tuned! :)
Thank you for sharing, I learned something new from you.
Can you make a video on how to decide driver/executor memory size and the number of executors based on file size (e.g. 100 GB) in Spark?
Resource level optimisation videos upcoming in the next few weeks, stay tuned! :)
This is awesome. What tools do you use for drawing, recording and presenting?
Thanks @animeshrajjha, Ecamm Live for recording, Notes on iPad for drawing and Notion for writing :)
Hello, I think one correction: even if the dimension table (songs) doesn't have a filter condition on release date, DPP would still work, right? It will forward the release dates that remain after the filter, irrespective of which column the filter is on. E.g. if we apply a filter on songID in the songs table and a few records are selected, whatever release dates those records have will be forwarded.
Thanks Afaque. Terminology-wise, is this the same as the filter pushdown you explained in the Query Plan video?
Hey @anandchandrashekhar2933 Appreciate it :)
On the question - DPP (dynamic partition pruning) is different from "filter pushdown", although it uses filter pushdown to prune the large dataset based on the filters from the smaller dataset. It's effective when you have a large dataset and a small one (which can be broadcast) and want to use the small dataset to filter records from the large dataset at scan time.
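To make the mechanism in this reply concrete, here is a minimal pure-Python sketch of the idea. It is a toy model, not Spark code: the names `fact_partitions`, `songs` and `dpp_scan` are made up for illustration, and it assumes the large table is partitioned by release date, which is also the join key.

```python
# Toy model of dynamic partition pruning (DPP); not Spark code.
# Assumption: the large table is partitioned by release_date, the join key.

# Large "fact" table: partition value -> rows stored in that partition.
fact_partitions = {
    "2020-01-01": [{"song_id": 1, "plays": 10}, {"song_id": 2, "plays": 7}],
    "2021-06-15": [{"song_id": 3, "plays": 4}],
    "2022-03-09": [{"song_id": 4, "plays": 12}],
}

# Small "dimension" table (hypothetical songs data).
songs = [
    {"song_id": 1, "release_date": "2020-01-01"},
    {"song_id": 2, "release_date": "2020-01-01"},
    {"song_id": 3, "release_date": "2021-06-15"},
    {"song_id": 4, "release_date": "2022-03-09"},
]


def dpp_scan(fact, dim, dim_filter):
    """Filter the small table, then read only the matching partitions."""
    # Step 1: apply the filter to the small table (any column works).
    kept = [row for row in dim if dim_filter(row)]
    # Step 2: collect the join-key values that survived the filter.
    keys = {row["release_date"] for row in kept}
    # Step 3: scan only the partitions whose value is in that key set.
    scanned = sorted(p for p in fact if p in keys)
    rows = [r for p in scanned for r in fact[p]]
    return scanned, rows


# The filter is on song_id, not release_date, yet pruning still applies.
scanned, rows = dpp_scan(fact_partitions, songs, lambda s: s["song_id"] in {1, 3})
print(scanned)    # partitions actually read
print(len(rows))  # rows read from those partitions
```

In this sketch the 2022 partition is never read, even though the filter was on `song_id`. The pruning works at partition level only, so rows for other songs in a surviving partition are still read and are left for the join to discard. In Spark itself the feature is governed by `spark.sql.optimizer.dynamicPartitionPruning.enabled`, and when it applies you should see a dynamic pruning expression among the partition filters in the physical plan.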
All the videos are great and nicely explained, but the video clarity is bad even at 4K.
Thanks, @sathyamoorthy2362, for the kind words. On the video quality, I was trying out a new tool and it didn't work out, but hope the other ones are good and you like them :)
What if both datasets are too big? Does the broadcast exchange still happen in that case?
Hey @rohitshingare5352, Good question. DPP generally works best when one table is large and the other table is small enough to be broadcast. The most significant reason for this is that if the two tables are large, the filters being moved will also be large (in the worst case), and propagating these filters over the network becomes the biggest bottleneck.
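A rough way to picture the point about filter size, again as a toy model rather than Spark's actual cost logic: the pruning filter is essentially the set of distinct join-key values from the filtering side, so it grows with that table. The helper `pruning_key_count` and both sample tables are hypothetical.

```python
# Toy illustration of why DPP favours a small filtering side.
# This is not how Spark estimates cost; it only counts distinct keys.

def pruning_key_count(dim_rows, key="release_date"):
    """Number of distinct join-key values that would be shipped to the large scan."""
    return len({row[key] for row in dim_rows})


# A small filtered table: three distinct release dates.
small_dim = [{"release_date": f"2020-01-{d:02d}"} for d in range(1, 4)]

# A large table where every row carries a different key (the worst case).
large_dim = [{"release_date": f"day-{i}"} for i in range(100_000)]

print(pruning_key_count(small_dim))
print(pruning_key_count(large_dim))
```

With the small table only three values have to travel to the scan of the large table. In the worst case for a large table, every row contributes its own value. For reference, Spark decides whether a table is small enough to broadcast automatically using `spark.sql.autoBroadcastJoinThreshold`, which defaults to 10 MB.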
Dead gorgeous stuff.
Appreciate it man :)