pyspark scenario based interview questions and answers

  • Published 11 Sep 2024

COMMENTS • 11

  • @Tech.S7
@Tech.S7 3 months ago

Thanks for the informative stuff.
Instead of specifying all the conditions in the join, we can specify just one condition (I mean the AND/OR conditions are not required).
It works and fetches the expected output.
Cheers!!
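
    A minimal sketch of the one-condition idea, assuming the problem's df has columns customer, start_location, end_location (one row per trip leg) and that locations are not shared across customers:

    from pyspark.sql.functions import col
    starts = df.select("customer", "start_location")
    ends = df.select("customer", "end_location")
    # single join condition, no AND/OR chain: a start that never appears as an end is the true start
    true_start = starts.join(ends, col("start_location") == col("end_location"), "leftanti")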

  • @PoojaM22
@PoojaM22 5 months ago

Awesome, bro! Please keep up the good work!

  • @siddharthchoudhary103
@siddharthchoudhary103 6 months ago

At the end, after finding the unique records, can we use collect_list with a groupBy on customer and then use indexes as the start and end location in withColumn?
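
    One way that suggestion could look, as a rough sketch. The trip_order column is hypothetical (not in the original): collect_list after a groupBy gives no ordering guarantee, so the lists need an explicit sort before indexing.

    from pyspark.sql.functions import collect_list, element_at, sort_array, struct
    # sort each customer's legs by the assumed trip_order, then index:
    # the first start_location is the trip start, the last end_location is the trip end
    agg = df.groupBy('customer').agg(
        sort_array(collect_list(struct('trip_order', 'start_location'))).alias('s'),
        sort_array(collect_list(struct('trip_order', 'end_location'))).alias('e'))
    result = (agg.withColumn('start', element_at('s', 1).getField('start_location'))
                 .withColumn('end', element_at('e', -1).getField('end_location'))
                 .drop('s', 'e'))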

  • @user-ju4ih5xr8e
@user-ju4ih5xr8e 6 months ago +3

Here is my solution:
from pyspark.sql.functions import concat, col
# create two dataframes, one for start locations and one for end locations
df1 = df.select('customer', 'start_location').alias('a')
df2 = df.select('customer', 'end_location').alias('b')
# anti-joins: keep starts that never appear as ends, and ends that never appear as starts
df3 = df1.join(df2, concat(col('a.customer'), col('a.start_location')) == concat(col('b.customer'), col('b.end_location')), 'leftanti')
df4 = df2.join(df1, concat(col('a.customer'), col('a.start_location')) == concat(col('b.customer'), col('b.end_location')), 'leftanti')
# final output: pair each customer's true start with their true end
df5 = df3.join(df4, ["customer"], 'inner')
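
    For reference, a guessed reconstruction of the input these solutions assume (the actual data in the video may differ): one row per trip leg, each customer's legs chaining start to end, with spark as the active SparkSession.

    data = [("c1", "Delhi", "Agra"), ("c1", "Agra", "Jaipur"), ("c2", "Pune", "Goa")]
    df = spark.createDataFrame(data, ["customer", "start_location", "end_location"])
    # expected output: (c1, Delhi, Jaipur) and (c2, Pune, Goa)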

  • @user-dv1ry5cs7e
@user-dv1ry5cs7e 5 months ago

with t1 AS (select customer, start_loc from travel_data where start_loc not in (select end_loc from travel_data)),
t2 AS (select customer, end_loc from travel_data where end_loc not in (select start_loc from travel_data))
select t1.customer, t1.start_loc, t2.end_loc from t2 join t1 on t2.customer = t1.customer
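
    To run this SQL from PySpark, a small sketch (assumes the same df as the other solutions, renamed to match this comment's start_loc/end_loc columns and registered under the view name the query uses):

    df.withColumnRenamed("start_location", "start_loc").withColumnRenamed("end_location", "end_loc").createOrReplaceTempView("travel_data")
    spark.sql("""
        with t1 AS (select customer, start_loc from travel_data where start_loc not in (select end_loc from travel_data)),
        t2 AS (select customer, end_loc from travel_data where end_loc not in (select start_loc from travel_data))
        select t1.customer, t1.start_loc, t2.end_loc from t2 join t1 on t2.customer = t1.customer
    """).show()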

  • @prabhatgupta6415
@prabhatgupta6415 6 months ago

    df1=df.select("customer","start_location")
    df2=df.select("customer","end_location")
    df3=df1.union(df2).groupBy("customer","start_location").agg(count("start_location").alias("count")).filter("count==1")
    df3.alias("a").join(df3.alias("b"),["customer"],"inner").filter("a.start_location

  • @user-tm4zj2zz8x
@user-tm4zj2zz8x 5 months ago

from pyspark.sql.functions import collect_list, udf
from pyspark.sql.types import StringType
# return the element of x that never occurs in y
def loc(x, y):
    a = [i for i in x if i not in y]
    return a[0]
loc_udf = udf(loc, StringType())
# gather each customer's start and end locations into lists
df1 = df.groupBy('customer').agg(collect_list('start_location').alias('start_list'), collect_list('end_location').alias('end_list'))
display(df1)
# the true start is the start never used as an end, and vice versa
df2 = df1.withColumn('start', loc_udf(df1.start_list, df1.end_list)).withColumn('end', loc_udf(df1.end_list, df1.start_list)).drop(*('start_list', 'end_list'))
display(df2)

  • @tradingwith10k10
@tradingwith10k10 3 months ago

No UDF, no join, no subquery:
from pyspark.sql.functions import collect_set, array_except
display(df.groupBy("customer")
.agg(collect_set("start_location").alias("start_list"), collect_set("end_location").alias("end_list"))
# the set difference start_list - end_list leaves only the true start (and vice versa)
.withColumn("start_location", array_except("start_list", "end_list").getItem(0))
.withColumn("end_location", array_except("end_list", "start_list").getItem(0))
.drop("start_list", "end_list"))
