Live Apache Spark Mock Interview | Spark | SQL | Databricks | Project based

  • Published 20 Aug 2024

COMMENTS • 12

  • @ravulapallivenkatagurnadha9605 3 months ago +1

    Much-needed videos

  • @SanjayKumar-rw2gj 3 months ago +2

    The solution provided by the interviewer is wrong

  • @hdr-tech4350 1 month ago

    3rd transaction of every user
    Cumulative sum of sales amount for each product_id

  • @soumen_22 3 months ago +2

    I think the example output for the cumulative sum question is wrong. As per my understanding, row 4 of the output example should be 90 instead of 210, since in the input the product_id of the 6th row differs from the 5th row.
    Even the solution shown will give a different output: it partitions by product_id, so the cumulative sums for each product_id come out grouped together, like below.
    id  product_id  sales_date  sales_amt  cumsum
    1   1           2024-01-01  100        100
    2   1           2024-01-02  150        250
    4   1           2024-01-03  120        370
    6   1           2024-01-04  90         460
    3   2           2024-01-01  200        200
    5   2           2024-01-02  180        380
    But the expectation was something else, if I am not wrong.
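    The per-product running total the comment describes can be checked with a standard window function. A minimal sketch using Python's built-in SQLite (whose `SUM(...) OVER` syntax matches Spark SQL for this query); the table name `sales`, the column names, and the six sample rows are taken from the comment:

    ```python
    import sqlite3

    # In-memory table with the sample rows from the comment.
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE sales (id INTEGER, product_id INTEGER,
                            sales_date TEXT, sales_amt INTEGER)
    """)
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [(1, 1, "2024-01-01", 100), (2, 1, "2024-01-02", 150),
         (3, 2, "2024-01-01", 200), (4, 1, "2024-01-03", 120),
         (5, 2, "2024-01-02", 180), (6, 1, "2024-01-04", 90)],
    )

    # Running total per product_id ordered by date: each product's rows
    # accumulate independently, exactly as the comment's table shows.
    rows = conn.execute("""
        SELECT id, product_id, sales_date, sales_amt,
               SUM(sales_amt) OVER (PARTITION BY product_id
                                    ORDER BY sales_date) AS cumsum
        FROM sales
        ORDER BY product_id, sales_date
    """).fetchall()
    for r in rows:
        print(r)
    ```

    Running this reproduces the commenter's table (370 and 460 for product 1's later rows, 200 and 380 for product 2), confirming that partitioning by product_id groups each product's cumulative sum together.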

  • @subhashyadav9262 3 months ago

    select * from (
        select *,
               row_number() over (partition by user_id
                                  order by transaction_date asc) as cnt
        from int1
    ) aa
    where cnt = 3
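    The query above can be run end to end with Python's built-in SQLite, which supports `ROW_NUMBER()`. The table name `int1` and the `user_id` / `transaction_date` columns come from the comment; the sample rows and the `amount` column are assumed for illustration:

    ```python
    import sqlite3

    # Assumed sample data; the comment only gives the query.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE int1 (user_id INTEGER, transaction_date TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO int1 VALUES (?, ?, ?)",
        [(1, "2024-01-01", 10), (1, "2024-01-02", 20), (1, "2024-01-03", 30),
         (2, "2024-01-01", 40), (2, "2024-01-02", 50), (2, "2024-01-03", 60),
         (3, "2024-01-01", 70)],  # user 3 has fewer than 3 transactions
    )

    # Number each user's transactions by date, then keep only row 3.
    third = conn.execute("""
        SELECT user_id, transaction_date, amount FROM (
            SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY user_id
                                      ORDER BY transaction_date ASC) AS cnt
            FROM int1
        ) AS aa
        WHERE cnt = 3
        ORDER BY user_id
    """).fetchall()
    print(third)  # users 1 and 2 only; user 3 never reaches a 3rd transaction
    ```

    Users with fewer than three transactions simply produce no `cnt = 3` row, so they drop out without any extra filtering.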

  • @adiadi-xe7ds 3 months ago

    @15:00
    Q: With AQE already enabled in Spark 3.0, if you are still facing an out-of-memory error, what is the solution?
    A: If we increase the shuffle partitions (from the default 200 to more than 200), will the out-of-memory error be resolved?
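    Raising shuffle parallelism is one common answer to this question, since more shuffle partitions mean smaller partitions per task. A hedged PySpark config sketch (the app name and the value 400 are illustrative, not from the video); note AQE already re-sizes shuffle partitions at runtime, so data skew is the other thing worth checking:

    ```python
    from pyspark.sql import SparkSession

    # Illustrative values only; the right numbers depend on data volume.
    spark = (
        SparkSession.builder
        .appName("oom-tuning-sketch")  # hypothetical app name
        # More shuffle partitions -> smaller partitions per task, which
        # can relieve executor out-of-memory pressure during shuffles.
        .config("spark.sql.shuffle.partitions", "400")
        # AQE coalesces/splits shuffle partitions at runtime
        # (enabled by default since Spark 3.2).
        .config("spark.sql.adaptive.enabled", "true")
        # A single skewed partition can still OOM even with AQE on,
        # so skew-join handling is often part of the answer too.
        .config("spark.sql.adaptive.skewJoin.enabled", "true")
        .getOrCreate()
    )
    ```

    This is a config fragment, not a complete fix: if one key's partition is disproportionately large, increasing the partition count alone will not resolve the OOM.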

  • @laxmipoojamule4297 3 months ago

    Sir, please upload a Python video; it has been more than 2 weeks