Cumulative Salary - PySpark Interview Question

  • Published 27 Jan 2025
  • Hello Everyone,
    from pyspark.sql.types import StructType, StructField, StringType, IntegerType
    from pyspark.sql.functions import col, sum
    from pyspark.sql.window import Window
    # Sample data: (ID, Name, Sal)
    data = [
        (1, "A", 1000),
        (2, "B", 2000),
        (3, "C", 3000),
        (4, "D", 4000),
    ]
    # Define the schema for the DataFrame
    schema1 = StructType([
        StructField("ID", IntegerType(), True),
        StructField("Name", StringType(), True),
        StructField("Sal", IntegerType(), True)
    ])
    # Assumes an active SparkSession named spark (e.g., in Databricks)
    df2 = spark.createDataFrame(data, schema=schema1)
    df2.show()
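    A minimal sketch of the running-total step the video works toward, using the imports above (Cum_Sal is an illustrative column name):
    # Cumulative salary: sum Sal from the first row through the current row, ordered by ID
    window_spec = Window.orderBy(col("ID")).rowsBetween(Window.unboundedPreceding, Window.currentRow)
    df2.withColumn("Cum_Sal", sum(col("Sal")).over(window_spec)).show()
    # Expected Cum_Sal values: 1000, 3000, 6000, 10000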
    This series is for beginner and intermediate-level candidates who want to crack PySpark interviews.
    Here is the link to the course: www.geekcoders...
    #pyspark #interviewquestions #interview #pysparkinterview #dataengineer #aws #databricks #python

COMMENTS • 6

  • @chandanpatra1053 9 months ago +1

    If you solve this question in Spark SQL instead, is there any problem in terms of optimization?

    • @GeekCoders 9 months ago

      Same
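      Both the DataFrame API and Spark SQL compile to the same Catalyst-optimized plan, which is why the answer is "same". A quick way to verify, assuming the df2 setup from the description:
      df2.createOrReplaceTempView("emp")
      w = Window.orderBy("ID").rowsBetween(Window.unboundedPreceding, Window.currentRow)
      df2.withColumn("run_total", sum("Sal").over(w)).explain()
      spark.sql("select *, sum(Sal) over (order by ID rows between unbounded preceding and current row) as run_total from emp").explain()
      # The two printed physical plans should be identical apart from cosmetic IDs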

    • @chandanpatra1053 9 months ago +1

      @@GeekCoders then can you please tell: if it can be solved in Spark SQL, why go with PySpark? Please make a video explaining the cases where Spark SQL fails to perform some transformation and PySpark comes into the picture. Also, the question you solved is a running total, which can easily be solved using sum (case when). And can you please make a video on what to prepare for interviews, for people who are in a different domain of IT, want to enter data engineering, and show around 1.5 yrs of experience in Databricks? Which topics need to be covered, and with a good grasp? I will wait for your reply. 🙏🙏🙏

    • @rawat7203 9 months ago

      @@chandanpatra1053 pyspark > sparksql in terms of optimization

  • @tarunbhatt.1995 5 months ago +2

    from pyspark.sql.types import StructType, StructField, StringType, IntegerType
    data = [(1, "A", 1000),
            (2, "B", 2000),
            (3, "C", 3000),
            (4, "D", 4000)]
    schema1 = StructType([StructField("ID", IntegerType(), True),
                          StructField("Name", StringType(), True),
                          StructField("Sal", IntegerType(), True)])
    df2 = spark.createDataFrame(data, schema=schema1)
    df2.show()
    # Register a temp view so the running total can be written in Spark SQL
    df2.createOrReplaceTempView('emp')
    df3 = spark.sql('select sum(sal) over (order by ID rows between unbounded preceding and current row) as Total_Sal from emp')
    df3.show()
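    Note on the frame clause: spelling out rows between unbounded preceding and current row is deliberate. With an order by and no explicit frame, Spark SQL defaults to range between unbounded preceding and current row, which treats tied ordering values as one group; the two frames only agree when the ordering key is unique.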

  • @souradeep.official 5 months ago

    from pyspark.sql.types import StructType, StructField, StringType, IntegerType
    from pyspark.sql.functions import col, sum
    from pyspark.sql.window import Window
    data = [
        (1, "A", 1000),
        (1, "A", 5000),
        (2, "B", 2000),
        (3, "C", 3000),
        (4, "D", 4000),
    ]
    schema = StructType([
        StructField('id', IntegerType(), True),
        StructField('name', StringType(), True),
        StructField('salary', IntegerType(), True)
    ])
    df = spark.createDataFrame(data, schema)
    df.show(truncate=False)
    # Running total of salary from the first row up to the current row, ordered by id
    window_spec = Window.orderBy(col('id')).rowsBetween(Window.unboundedPreceding, Window.currentRow)
    df1 = df.withColumn('cum_sal', sum(col('salary')).over(window_spec)).select(col('cum_sal'))
    df1.show(truncate=False)
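    One caveat with this variant: since id repeats (two rows with id 1), Window.orderBy(col('id')) does not fully determine row order, so the intermediate cum_sal for the tied rows can differ between runs; the totals from the second row onward are stable. Ordering by a unique key, or adding a tiebreaker column to orderBy, makes the running total deterministic.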