Spark Join Without Shuffle | Spark Interview Question
- Published Jan 16, 2025
- #Spark #Join #Internals #Performance #Optimization #DeepDive #Shuffle: In this video, we discuss how to perform a join without a shuffle.
Please join my channel as a member to get additional benefits like materials on Big Data and Data Science, live streams for members, and more.
Click here to subscribe : / @techwithviresh
About us:
We are a technology consulting and training provider, specializing in areas such as Machine Learning, AI, Spark, Big Data, NoSQL, graph databases, Cassandra, and the Hadoop ecosystem.
Mastering Spark : • Spark Scenario Based I...
Mastering Hive : • Mastering Hive Tutoria...
Spark Interview Questions : • Cache vs Persist | Spa...
Mastering Hadoop : • Hadoop Tutorial | Map ...
Visit us :
Email: techwithviresh@gmail.com
Facebook : / tech-greens
Thanks for watching
Please Subscribe!!! Like, share and comment!!!!
That's what I was looking for. It's a great help, Viresh.
I'm really happy with your deep dive into Spark. Thank you.
small2 is not defined. Also, why is the shuffle cost of partitioning the two RDDs separately lower than the shuffle cost of joining them directly? They are basically doing the same thing: moving data with the same join key to the same executor.
I feel the title is misleading; repartitioning the two RDDs still involves a shuffle.
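For readers asking what the video actually does, here is a minimal sketch of the technique under discussion, with toy data; the names small, large, smallP, and largeP are illustrative, and spark is assumed to be a live SparkSession (as in spark-shell or a notebook):

```scala
import org.apache.spark.HashPartitioner
import org.apache.spark.storage.StorageLevel

// Toy pair RDDs keyed by the join column (names and data are illustrative).
val small = spark.sparkContext.parallelize(Seq((1, "a"), (2, "b")))
val large = spark.sparkContext.parallelize(Seq((1, "x"), (2, "y"), (1, "z")))

// Pay the shuffle cost once, up front: co-partition both RDDs with the
// SAME partitioner instance and persist the partitioned layout.
val part = new HashPartitioner(2)
val smallP = small.partitionBy(part).persist(StorageLevel.MEMORY_AND_DISK)
val largeP = large.partitionBy(part).persist(StorageLevel.MEMORY_AND_DISK)

// Both sides now share a partitioner, so the join is planned as a
// narrow dependency: no additional shuffle happens at join time.
val joined = largeP.join(smallP)
joined.collect().foreach(println)
```

The shuffle is not eliminated outright: partitionBy pays it once, up front. Because both persisted RDDs then share the same partitioner, the join itself runs shuffle-free.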
Let's say you have two large DataFrames; how would you optimize the join then? And why are you using RDDs, given they are much slower than DataFrames?
Any suggestions for DataFrames?
Why are you converting the DataFrame to an RDD? That's bad practice in terms of performance.
What's the benefit of persisting the 2 RDDs?
Thanks Viresh. RDDs are used here; how would you do the same using a Dataset/DataFrame? And where did "small2" come from?
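Several comments ask for a Dataset/DataFrame equivalent. The closest analogue is bucketing, sketched here under the assumption of a SparkSession with table support; the table, column, and DataFrame names (largeDf, smallDf, id) are hypothetical, not from the video:

```scala
// Write both sides bucketed on the join key with the same bucket count.
largeDf.write.bucketBy(8, "id").sortBy("id").saveAsTable("large_bkt")
smallDf.write.bucketBy(8, "id").sortBy("id").saveAsTable("small_bkt")

// The sort-merge join can now read matching buckets directly;
// the plan should contain no Exchange before the join.
val joined = spark.table("large_bkt").join(spark.table("small_bkt"), "id")
joined.explain()
```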
Really nice, thanks bro. In line 14, should it be "small.partitioner.get" instead of "small2.partitioner.get"? And why is shuffle.partitions set to only 2?
Otherwise the remaining 198 partitions would be empty.
@TechWithViresh Is it "otherwise" or "in other words"? Do we want to keep 198 partitions empty?
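On the shuffle.partitions point, the likely reasoning (my inference, not a quote from the video) is that the demo data has only two distinct keys, so the default of 200 shuffle partitions would leave 198 empty:

```scala
// Default is 200; for a two-key toy dataset, 198 partitions would hold
// nothing and just add task-scheduling overhead.
spark.conf.set("spark.sql.shuffle.partitions", "2")
```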
I was asked: if I have two tables with the same volume of data, but one has 10 columns and the other has 3, how do you optimize that join?
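One standard answer to that interview question (a suggestion on my part, not from the video) is to prune unused columns before the join, so every shuffled record is smaller; wideDf, narrowDf, and the column names are hypothetical:

```scala
// Keep only the columns the downstream job actually needs from the
// wide table, so less data moves per record during the shuffle.
val wideSlim = wideDf.select("id", "col_a", "col_b")
val joined = wideSlim.join(narrowDf, "id")
```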
What's the book/picture in the video?
I have been following the series; it's pretty good, but this video is not at all clear. You should make another one on the same question.
The concept is really worth testing. The code is incomplete in places; I took time to fill the gaps. The last line is display(); will it work in Scala Spark? 🙄
This code will run fine on Azure Databricks.
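To expand on that reply: display() is a Databricks notebook utility, not part of the open-source Spark API. A portable alternative, assuming joined is the pair RDD from the sketch above:

```scala
import spark.implicits._

// Outside Databricks, convert the RDD to a DataFrame and use show().
joined.toDF("key", "pair").show()
```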
Where did small2 come from? There are typos; can you please update it?
Shuffle during the join, or repartitioning before the join: you are saying the second one is better, right? What's the difference? You haven't explained why it is better. Someone has to take care of the partitioning either way: either the join shuffles, or we repartition ourselves. Please let us know why this approach is better.
Even with repartitioning we have to move data to different partitions, which causes a shuffle, doesn't it?
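Yes, partitionBy itself shuffles; the point is that this shuffle is paid once and the result is persisted, so later joins against the same RDD stay narrow. A sketch continuing from the co-partitioned RDDs above (the extra RDD other is hypothetical):

```scala
// The first action materializes the one-time shuffle while persisting
// smallP and largeP.
largeP.join(smallP).count()

// A later, different join against the SAME persisted largeP reuses its
// partitioning; co-partitioning the new side with the same instance of
// `part` keeps the join itself shuffle-free.
val other = spark.sparkContext
  .parallelize(Seq((1, 10), (2, 20)))
  .partitionBy(part)
largeP.join(other).count()
```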
partitioner is None for largeRDD at line no. 14, so partitioner.get fails.
Getting an error on line number 14:
error: value partitioner is not a member of org.apache.spark.sql.DataFrame
Kindly suggest.
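That error is expected: partitioner is a member of RDD, not of DataFrame. A sketch of the distinction, assuming a DataFrame df whose first column is an integer join key (an assumption for illustration):

```scala
import org.apache.spark.HashPartitioner

// DataFrames expose no .partitioner; drop to the RDD API first.
val pairRdd = df.rdd.map(row => (row.getInt(0), row))
val partitioned = pairRdd.partitionBy(new HashPartitioner(2))

partitioned.partitioner // Some(org.apache.spark.HashPartitioner@...)
df.rdd.partitioner      // None: a freshly converted RDD has no partitioner
```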
Very nice content on this channel, thanks for that. Q: can range partitioning work here?
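In principle, yes: any Partitioner co-locates equal keys, as long as both sides use the same instance. A sketch with RangePartitioner, reusing the small and large RDDs from the earlier example:

```scala
import org.apache.spark.RangePartitioner

// RangePartitioner samples one RDD to pick key boundaries; equal keys
// still land in the same partition, which is all the join needs.
val rp = new RangePartitioner(2, large)
val largeR = large.partitionBy(rp).persist()
val smallR = small.partitionBy(rp).persist()

// Same partitioner on both sides, so again no shuffle at join time.
largeR.join(smallR).count()
```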
How to connect with you?
TechWithViresh@gmail.com
The video recommendations at the end are blocking the content...
Your example is not up to the mark. What you describe in the lecture is not understandable; it feels like the video was made just for the sake of making a video. I did not get your point about how the join happens and what goes on during it. Please explain it in a much more understandable manner.