Tiger Analytics PySpark Interview Question | Databricks |

  • Published 5 Feb 2025
  • Year End Sale on every course
    Roadmap of 2025 for Data Engineers
    1. SQL
    2. Python
    3. Spark - Databricks - UC, DLT, Mosaic AI, etc.
    4. Azure - Azure Data Factory, ADLS, IAM
    5. CI/CD using GitHub Actions, Azure DevOps
    6. Data Modelling, Data Warehouse concepts
    7. End to End Projects
    Here is the promo code : BLKFRI20 - Get 35% off
    Here are the course links:
    1. Databricks course - www.geekcoders...
    2. Data Modelling course - www.geekcoders...
    3. End to end projects - www.geekcoders...
    4. Azure Data Factory course - www.geekcoders...
    5. PySpark interview question - www.geekcoders...
    6. Full course - www.geekcoders...
    Question:
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col, count, when

    # Initialize SparkSession
    spark = SparkSession.builder.appName("CTR_Calculation").getOrCreate()

    # Input data
    data = [
        (123, "impression", "07/18/2022 11:36:12"),
        (123, "impression", "07/18/2022 11:37:12"),
        (123, "click", "07/18/2022 11:37:42"),
        (234, "impression", "07/18/2022 14:15:12"),
        (234, "click", "07/18/2022 14:16:12"),
    ]
    # Schema
    columns = ["app_id", "event_type", "timestamp"]
    # Create DataFrame
    df = spark.createDataFrame(data, columns)
    display(df)

    # Count events per app and event type
    df_1 = df.groupBy("app_id", "event_type").count()
    display(df_1)
    # Pivot the event types into columns: one row per app
    df_2 = df_1.groupBy("app_id").pivot("event_type", ["impression", "click"]).sum("count")
    display(df_2)
    # CTR (%) = clicks / impressions * 100
    df_3 = df_2.withColumn("ctr", (col("click") / col("impression")) * 100).drop("impression", "click")
    display(df_3)
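
As a sanity check on what df_3 should contain, the same CTR arithmetic can be reproduced in plain Python without Spark. This is only a sketch for verifying the numbers by hand; the names `events`, `counts`, and `expected_ctr` are illustrative and not part of the original solution.

```python
from collections import Counter

# Same sample events as in the question: (app_id, event_type, timestamp)
events = [
    (123, "impression", "07/18/2022 11:36:12"),
    (123, "impression", "07/18/2022 11:37:12"),
    (123, "click", "07/18/2022 11:37:42"),
    (234, "impression", "07/18/2022 14:15:12"),
    (234, "click", "07/18/2022 14:16:12"),
]

# Count events per (app_id, event_type), mirroring groupBy(...).count()
counts = Counter((app_id, event_type) for app_id, event_type, _ in events)

# CTR (%) = clicks / impressions * 100, computed per app.
# Assumes every app has at least one impression, as in the sample data.
expected_ctr = {
    app_id: counts[(app_id, "click")] / counts[(app_id, "impression")] * 100
    for app_id in sorted({app_id for app_id, _, _ in events})
}

print(expected_ctr)  # {123: 50.0, 234: 100.0}
```

App 123 has 1 click over 2 impressions (50%), and app 234 has 1 click over 1 impression (100%).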
    #databricks #interview #azuredataengineer #datamodeling #projects #realtime

COMMENTS • 5

  • @musicstore3629 1 month ago

    Sagar, can't we group by app_id and then aggregate, using count with a case-when inside the aggregation to calculate the CTR? That could also be a solution, couldn't it?

  • @ganeshchintala6668 1 month ago

    Is your course suitable for beginners who are entering the data engineering field? Does it cover everything?
    I know Python and SQL.
    Hope I get a reply.

  • @kalyan8086 1 month ago +2

    Hi Sagar, if we enroll in the courses on your website, do we get lifetime access to them (similar to Udemy), or is access limited to a certain period?