Azure Data Engineer Mock Interview | PySpark | Delta Live Tables| Managerial

Поділитися
Вставка
  • Опубліковано 20 сер 2024

КОМЕНТАРІ • 13

  • @imranhossain1660
    @imranhossain1660 4 місяці тому +8

    Optimize is a perfomance optimization technique available in delta lake table. Whenever we perform any kind of DML operations in Delta table each and every time it generates a new records. Over a period of time it generates a huge number of small files and it is kind of overhead for the delta engine to effectively perform the execution of our query as it eventually increases our resource usage such as i/o read/write and the computaion. Optimize command helps us to combine these small files to a larger file which eventually improves the performance of the delta table. As after this optimize operation, delta table refers to this latest snapshot file in order to retrieve results whever we query our table. And we can delete the obselete small files and free up the space with the help of vaccum command.

  • @prakashtripathi270
    @prakashtripathi270 4 місяці тому

    Thank you Sumit sir for arranging such a insightful session..

    • @sumitmittal07
      @sumitmittal07  4 місяці тому

      Always happy to help the community!

  • @user-jg2tn1wb3d
    @user-jg2tn1wb3d 4 місяці тому

    Thank u sumit sir

  • @maheshtiwari2297
    @maheshtiwari2297 4 місяці тому +1

    Hello sir, i have interview for big data/Etl developer at amazon please guide me for that.

  • @user-jg2tn1wb3d
    @user-jg2tn1wb3d 4 місяці тому

    Sir, need one video to know how bussiness requirement is, and how data engg gets the bussiness requirement and working strategy

    • @sumitmittal07
      @sumitmittal07  4 місяці тому +1

      Noted. Will have a session around this aspect!

    • @tahiliani22
      @tahiliani22 4 місяці тому

      I would like to add to this. If I am understanding correctly, @user_j% is talking about questions like Design Yelp but from a Database perspective.

  • @MsMohanj
    @MsMohanj 4 місяці тому

    Is it the join is correct or we can go for left join

  • @codinggeek9992
    @codinggeek9992 3 місяці тому

    Less questions from Azure Synapse....

  • @zaffer2024
    @zaffer2024 4 місяці тому

    Tough sql question, 😭

    • @user-nv6ho7uk8b
      @user-nv6ho7uk8b 4 місяці тому

      Check this.
      create table oldest_youngest(person varchar(10),type varchar(20),age int);
      insert into oldest_youngest values
      ('A1','ADULT',54),
      ('A2','ADULT',53),
      ('A3','ADULT',52),
      ('A4','ADULT',58),
      ('A5','ADULT',54),
      ('C1','CHILD',20),
      ('C2','CHILD',19),
      ('C3','CHILD',22),
      ('C4','CHILD',15);
      WITH ranked_adult AS
      (
      SELECT person as adult, ROW_NUMBER() OVER(ORDER BY age desc) as r_a FROM oldest_youngest where type = 'ADULT'
      ),
      ranked_child as
      (
      SELECT person as child, ROW_NUMBER() OVER(ORDER BY age asc) as r_c FROM oldest_youngest where type = 'child'
      )
      SELECT adult,child FROM
      ranked_adult a
      left join
      ranked_child c
      on a.r_a = c.r_c

  • @user-nv6ho7uk8b
    @user-nv6ho7uk8b 4 місяці тому +1

    create table oldest_youngest(person varchar(10),type varchar(20),age int);
    insert into oldest_youngest values
    ('A1','ADULT',54),
    ('A2','ADULT',53),
    ('A3','ADULT',52),
    ('A4','ADULT',58),
    ('A5','ADULT',54),
    ('C1','CHILD',20),
    ('C2','CHILD',19),
    ('C3','CHILD',22),
    ('C4','CHILD',15);
    WITH ranked_adult AS
    (
    SELECT person as adult, ROW_NUMBER() OVER(ORDER BY age desc) as r_a FROM oldest_youngest where type = 'ADULT'
    ),
    ranked_child as
    (
    SELECT person as child, ROW_NUMBER() OVER(ORDER BY age asc) as r_c FROM oldest_youngest where type = 'child'
    )
    SELECT adult,child FROM
    ranked_adult a
    left join
    ranked_child c
    on a.r_a = c.r_c