Tricky Data Science Interview Question [By Facebook]

Поділитися
Вставка
  • Опубліковано 9 січ 2025

КОМЕНТАРІ • 37

  • @kayodewilliams9118
    @kayodewilliams9118 3 роки тому +2

    Super straight forward.. Thank you. I wouldn't have thought through using a union.

    • @stratascratch
      @stratascratch  3 роки тому

      Thanks for watching! I did say it was a tricky problem =) Thanks for following along.

  • @joaopedroreissilva7075
    @joaopedroreissilva7075 3 роки тому +2

    Really great, Nate.
    I'm not sure if it would be better based on SQL theory, but I'd prefer to use dense_rank instead of join at the end of the code.
    Because of organization and also if next will be necessary to find for example the 2º highest consumption or something like this, it's more practical.

    • @stratascratch
      @stratascratch  3 роки тому +1

      If you can support your reasoning for using any function, I saw go for it! Not all questions have a black and white solution. But if you can support your approach then your approach is usually correct. Especially if a stakeholder accepts it.

  • @prash516
    @prash516 3 роки тому

    This was brilliant Nate !
    Thanks :)

  • @NotFound-iu8wx
    @NotFound-iu8wx 4 роки тому +4

    Hello Nate, can we use rank window function and Union all to get the result?

    • @stratascratch
      @stratascratch  4 роки тому +2

      Yes you can do that. The UNION ALL would basically add up all the energy values by date. The rank window function would be used to find the max value I'm assuming? You can try it out on the platform and see if you get the same result as me. The link to the question is in the description. Let me know if you are able to get the same answer!

    • @ds_yat5918
      @ds_yat5918 4 роки тому

      @@stratascratch Got the same by using rank window function ! Thanks for such a great video!! Merry Christmas

    • @stratascratch
      @stratascratch  4 роки тому

      @@ds_yat5918 Merry Christmas and happy holidays!

    • @KamagongTree
      @KamagongTree 3 роки тому +1

      If the interviewer is a stickler about performance, they may ask you which would be less expensive. Window functions are usually seen as more efficient than joining more data.

    • @stratascratch
      @stratascratch  3 роки тому

      @@KamagongTree very very true. Especially when working large amounts of data. Joining more data, running subqueries and CTEs, can take up a lot of time and slow things down.

  • @portiseremacunix
    @portiseremacunix 4 роки тому

    Happy to learn something new today!

    • @stratascratch
      @stratascratch  4 роки тому

      Thanks! Glad you learned something new. let me know if you have any feedback or topic ideas.

  • @ahmedshehata9522
    @ahmedshehata9522 2 роки тому +1

    I really appreciate your videos, specially those kind of videos of getting actual production scenarios. Also, I am really happy that I had similar approach of your answer.
    The issue is that I am super un confident person. I was always on top of my class but i always get paranoia since i was student and felt that this issue ruined my career. Do you have any videos of this? i feel like i have to study everything online since i do not have much professional experience from my jobs to be able be confident with interviews with good companies :)

    • @stratascratch
      @stratascratch  2 роки тому

      I think you have conquered your first obstacle and that is acknowledging that there is a problem. From here on now, it will be easier to identify a solution for you. It is great that you continue to study and take online courses. I shared about creating projects with real business applications to practice and hone your skills and add to your portfolio. I also shared in my blogs some tips in an interview. Hope this helps you too. Hope through practice you can build up your confidence. www.stratascratch.com/blog/what-not-to-do-during-a-technical-interview/

  • @yashovardhan9841
    @yashovardhan9841 3 роки тому +1

    My approach -
    WITH temp
    AS (SELECT *
    FROM fb_eu_energy
    UNION ALL
    SELECT *
    FROM fb_asia_energy
    UNION ALL
    SELECT *
    FROM fb_na_energy),
    temp2
    AS (SELECT date,
    Sum(consumption) AS total,
    Rank()
    OVER (
    ORDER BY Sum(consumption) DESC) AS r
    FROM temp
    GROUP BY 1)
    SELECT date,
    total
    FROM temp2
    WHERE r = 1

  • @followmycrafts8811
    @followmycrafts8811 4 роки тому

    Thanks a lot for your Tutorial

    • @stratascratch
      @stratascratch  4 роки тому

      Thank you for watching! Feel free to let me know if you have any feedback or ideas for more topics.

  • @معاويةالفرارجة

    what If I used in python outer join and fill navalues with zero)
    ??

    • @stratascratch
      @stratascratch  Рік тому

      give it a try on the platform and see if your solution validates.

  • @PATRICKCHUAD
    @PATRICKCHUAD 3 роки тому

    Thanks for Sharing. Now just learn there is a command called CTE. very helpful tips.

    • @stratascratch
      @stratascratch  3 роки тому +1

      Yea, it's basically similar to creating a temp table but it can only be used for that query. Sort of like a subquery but at the top of your query.

    • @PATRICKCHUAD
      @PATRICKCHUAD 3 роки тому

      @@stratascratch I realized this is similar to a sub query I'm doing in MS access where sometimes it took 4 to 5 times subquery before I got the final result. Same the previous query is used in the next query.

  • @TheShrayansh
    @TheShrayansh 4 роки тому

    Wow❤️

    • @stratascratch
      @stratascratch  4 роки тому

      Glad you enjoyed it. Let me know if you have any feedback or requests for other topics.

  • @rakibraihanrimon8784
    @rakibraihanrimon8784 4 роки тому

    Thanks

    • @stratascratch
      @stratascratch  4 роки тому

      Glad you liked it. Let me know if you have any requests or feedback for me!

  • @LunaMarlowe327
    @LunaMarlowe327 3 роки тому

    gd

  • @ayeoh47
    @ayeoh47 2 роки тому +1

    this is a very non efficient way to solve this problem. please fix this video with a better solution

    • @stratascratch
      @stratascratch  2 роки тому

      lol, yea you're probably right. I have ideas of other ways to solve it. Care to give me your solution?

  • @kritiverma1342
    @kritiverma1342 2 роки тому

    which one is more optimized if given an option - union all or full join. I am thinking of having all the records using an outer join and then colasce to get the first non-null date, and adding up all the three consumption. Then next approach can be using rank or using a join on max energy.
    select COALSCE(eu.date, asia.date, na.consumption),
    (ISNULL(eu.consumption, 0) + ISNULL(asia.consumption, 0) + ISNULL(na.consumption, 0)) as summ
    from fb_eu_energy eu
    FULL JOIN fb_asia_energy asia
    ON eu.date = asia.date
    FULL JOIN fb_na_energy na
    ON eu.date = na.date ;

    • @stratascratch
      @stratascratch  2 роки тому +1

      My opinion is that UNION is more optimized because I've found that it can handle bigger datasets. Doing a full join will blow up the rows and you might run out of memory. It's less of a risk (but still a risk) using a UNION

    • @kritiverma1342
      @kritiverma1342 2 роки тому

      @@stratascratch Thanks for the quick reply.

  • @annachan6706
    @annachan6706 2 роки тому

    with all_consumed as (
    select * from fb_eu_energy
    UNION
    SELECT * FROM fb_asia_energy
    UNION
    SELECT * FROM fb_na_energy
    )
    SELECT
    sum(consumption),
    date
    FROM all_consumed
    group by 2
    order by sum(consumption) desc
    limit 1