Azure Data Engineer Mock Interview | PySpark | Delta Live Tables| Managerial

Поділитися
Вставка
  • Опубліковано 1 січ 2025

КОМЕНТАРІ • 13

  • @imranhossain1660
    @imranhossain1660 9 місяців тому +8

    Optimize is a perfomance optimization technique available in delta lake table. Whenever we perform any kind of DML operations in Delta table each and every time it generates a new records. Over a period of time it generates a huge number of small files and it is kind of overhead for the delta engine to effectively perform the execution of our query as it eventually increases our resource usage such as i/o read/write and the computaion. Optimize command helps us to combine these small files to a larger file which eventually improves the performance of the delta table. As after this optimize operation, delta table refers to this latest snapshot file in order to retrieve results whever we query our table. And we can delete the obselete small files and free up the space with the help of vaccum command.

  • @prakashtripathi270
    @prakashtripathi270 9 місяців тому

    Thank you Sumit sir for arranging such a insightful session..

    • @sumitmittal07
      @sumitmittal07  9 місяців тому

      Always happy to help the community!

  • @VaidehiH-v2l
    @VaidehiH-v2l 9 місяців тому

    Thank u sumit sir

  • @maheshtiwari2297
    @maheshtiwari2297 9 місяців тому +1

    Hello sir, i have interview for big data/Etl developer at amazon please guide me for that.

  • @VaidehiH-v2l
    @VaidehiH-v2l 9 місяців тому

    Sir, need one video to know how bussiness requirement is, and how data engg gets the bussiness requirement and working strategy

    • @sumitmittal07
      @sumitmittal07  9 місяців тому +1

      Noted. Will have a session around this aspect!

    • @tahiliani22
      @tahiliani22 9 місяців тому

      I would like to add to this. If I am understanding correctly, @user_j% is talking about questions like Design Yelp but from a Database perspective.

  • @MsMohanj
    @MsMohanj 9 місяців тому

    Is it the join is correct or we can go for left join

  • @codinggeek9992
    @codinggeek9992 7 місяців тому

    Less questions from Azure Synapse....

  • @zaffer2024
    @zaffer2024 9 місяців тому

    Tough sql question, 😭

    • @Rakesh-q7m8r
      @Rakesh-q7m8r 9 місяців тому

      Check this.
      create table oldest_youngest(person varchar(10),type varchar(20),age int);
      insert into oldest_youngest values
      ('A1','ADULT',54),
      ('A2','ADULT',53),
      ('A3','ADULT',52),
      ('A4','ADULT',58),
      ('A5','ADULT',54),
      ('C1','CHILD',20),
      ('C2','CHILD',19),
      ('C3','CHILD',22),
      ('C4','CHILD',15);
      WITH ranked_adult AS
      (
      SELECT person as adult, ROW_NUMBER() OVER(ORDER BY age desc) as r_a FROM oldest_youngest where type = 'ADULT'
      ),
      ranked_child as
      (
      SELECT person as child, ROW_NUMBER() OVER(ORDER BY age asc) as r_c FROM oldest_youngest where type = 'child'
      )
      SELECT adult,child FROM
      ranked_adult a
      left join
      ranked_child c
      on a.r_a = c.r_c

  • @Rakesh-q7m8r
    @Rakesh-q7m8r 9 місяців тому +2

    create table oldest_youngest(person varchar(10),type varchar(20),age int);
    insert into oldest_youngest values
    ('A1','ADULT',54),
    ('A2','ADULT',53),
    ('A3','ADULT',52),
    ('A4','ADULT',58),
    ('A5','ADULT',54),
    ('C1','CHILD',20),
    ('C2','CHILD',19),
    ('C3','CHILD',22),
    ('C4','CHILD',15);
    WITH ranked_adult AS
    (
    SELECT person as adult, ROW_NUMBER() OVER(ORDER BY age desc) as r_a FROM oldest_youngest where type = 'ADULT'
    ),
    ranked_child as
    (
    SELECT person as child, ROW_NUMBER() OVER(ORDER BY age asc) as r_c FROM oldest_youngest where type = 'child'
    )
    SELECT adult,child FROM
    ranked_adult a
    left join
    ranked_child c
    on a.r_a = c.r_c