Thanks for the video. Please continue these PySpark interview videos.
Thanks Again.
Really nice, clear explanation. Thanks a lot. Please keep posting more videos on PySpark.
Great questions!
Loving your channel more day by day
Really helpful.....Looking for more videos :)
Glad to hear that
@thedatatech Please upload more videos and link topic-related videos in the description.
awesome video
Only one video on the Pyspark Playlist ...
Pls post more!!
New subscriber added
Hi Bro
You always teach Amazing stuff
Great work 😊
I have a request:
Can you please share a syllabus or some kind of preparation strategy for Spark?
Bro, thanks for your effort in sharing real-time, interview-based questions and answers. Can you please share interview questions and answers on real-time streaming (Kafka and PySpark)?
Doubt about the 1st question:
The delimiters in the data are "," "\t" and "|".
Then why did you use ",|\t|\|"?
Please explain.
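split takes a regex pattern, so ",|\t|\|" means "," OR "\t" OR "|". The last pipe is written as \| because an unescaped | is the regex alternation operator, not a literal character.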
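To see the alternation at work outside Spark, here is a minimal sketch using Python's own re module. PySpark's split uses Java regular expressions, so treat this as an illustration of the pattern rather than of Spark itself; the sample line is made up.

```python
import re

# The same pattern from the comment: comma OR tab OR a literal pipe.
# A raw string keeps the backslashes intact for the regex engine.
pattern = r",|\t|\|"

# Hypothetical sample line mixing all three delimiters.
line = "1,mohan\tkumar|5000"
parts = re.split(pattern, line)
print(parts)

# Without the escape, the trailing "|" just adds an empty alternative,
# so the pattern no longer means "a literal pipe".
unescaped = r",|\t|"
pipe_matched = re.fullmatch(unescaped, "|") is not None
print(pipe_matched)
```

The first print shows the line broken into four fields; the second shows that the unescaped pattern does not match a pipe character.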
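Instead of a left anti join we can use except (subtract or exceptAll in the DataFrame API), as long as both DataFrames have the same columns, because except compares whole rows rather than a join key.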
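The two are not always interchangeable, though. The sketch below models the difference with plain Python lists, not Spark code, and the rows are invented for illustration: a left anti join filters on a key and keeps duplicates, while a distinct except compares entire rows and removes duplicates.

```python
# Rows are (id, name) tuples; the data is made up for illustration.
left = [(1, "a"), (1, "a"), (2, "b"), (3, "c")]
right = [(2, "b"), (3, "x")]

# Left anti join on id: keep left rows whose id never appears in right.
right_ids = {row[0] for row in right}
left_anti = [row for row in left if row[0] not in right_ids]

# EXCEPT (distinct): compare whole rows, then drop duplicates.
right_rows = set(right)
except_distinct = sorted({row for row in left if row not in right_rows})

print(left_anti)
print(except_distinct)
```

Here id 3 is dropped by the anti join because the key matches, but kept by except because the full row (3, "c") differs from (3, "x").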
I feel we can solve the 7th question using the window function row_number() as well.
In an interview, is it okay if we solve the problems in SQL using Spark SQL?
It depends. Sometimes the interviewer specifically asks you not to use SQL and to use the DataFrame API instead.
A few interviewers will ask you to write the code in both Spark SQL and PySpark.
This way they can assess your PySpark and SQL knowledge in a single scenario-based question.
Bro, thanks for your inputs. The data below is in a file. Can you please help me handle it? I had a bit of trouble adapting your one-line string example to my data, which has multiple rows with multiple delimiters.
empid,fname|lname@sal#deptid
1,mohan|kumar@5000#100
2,karna|varadan@3489#101
3,kavitha|gandan@6000#102
Expected output
empid,fname,lname,sal,deptid
1,mohan,kumar,5000,100
2,karna,varadan,3489,101
3,kavitha,gandan,6000,102
Bro, thanks for your inputs. The data below is in a file. Can you please help me handle it in PySpark?
empid,fname|lname@sal#deptid
1,mohan|kumar@5000#100
2,karna|varadan@3489#101
3,kavitha|gandan@6000#102
Expected output
empid,fname,lname,sal,deptid
1,mohan,kumar,5000,100
2,karna,varadan,3489,101
3,kavitha,gandan,6000,102
from pyspark.sql import SparkSession
from pyspark.sql.functions import split, col
spark = SparkSession.builder.appName("MyApp").getOrCreate()
# The file is comma-delimited, so it loads as two columns: "empid" and one
# packed column whose header is literally "fname|lname@sal#deptid".
df = (spark.read.format("csv")
      .option("header", "true").option("inferSchema", "true")
      .option("delimiter", ",")
      .load("D:/data/employees.csv"))
packed = col("fname|lname@sal#deptid")
# split() takes a regex, so "|" is escaped; "@" and "#" are plain literals.
exp_op = (df
    .withColumn("fname", split(packed, "\\|").getItem(0))
    .withColumn("lname", split(split(packed, "\\|").getItem(1), "@").getItem(0))
    .withColumn("sal", split(split(packed, "@").getItem(1), "#").getItem(0))
    .withColumn("deptid", split(packed, "#").getItem(1))
    .select("empid", "fname", "lname", "sal", "deptid"))
exp_op.show()
# output
+-----+-------+-------+----+------+
|empid|  fname|  lname| sal|deptid|
+-----+-------+-------+----+------+
|    1|  mohan|  kumar|5000|   100|
|    2|  karna|varadan|3489|   101|
|    3|kavitha| gandan|6000|   102|
+-----+-------+-------+----+------+
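A shorter route may be to treat every delimiter as one regex character class. The sketch below checks that idea in plain Python on the rows from the question, without Spark; the names DELIMS and normalize are made up here. In PySpark the packed column could presumably be split once with the pattern "[|@#]" and the pieces picked out with getItem(0) through getItem(3), which is an assumption to verify on your own data.

```python
import re

# Sample rows copied from the question; the first line is the header.
lines = [
    "empid,fname|lname@sal#deptid",
    "1,mohan|kumar@5000#100",
    "2,karna|varadan@3489#101",
    "3,kavitha|gandan@6000#102",
]

# One character class covers every delimiter in the file.
# Inside [...] the pipe is literal, so it needs no escaping.
DELIMS = re.compile(r"[,|@#]")

def normalize(line: str) -> str:
    """Split on any delimiter and rejoin the fields with commas."""
    return ",".join(DELIMS.split(line))

cleaned = [normalize(line) for line in lines]
for row in cleaned:
    print(row)
```

This normalizes the header and the data rows the same way, so the result can be read back as an ordinary comma-separated file.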