Facebook's Most Common Data Science SQL Interview Question [2021 Interview Question and Answer]

Поділитися
Вставка
  • Опубліковано 12 січ 2025

КОМЕНТАРІ • 64

  • @dwaipayansaha4443
    @dwaipayansaha4443 2 роки тому +1

    Hi Nate, my solution is:-
    select (count(*))/(select count(*) from fb_active_users where country='USA') as monthly_share from fb_active_users
    where status='open' and country='USA';

  • @friendsplain
    @friendsplain Рік тому +1

    Nate your content is gold! Thanks for your detailed walkthrough of the problem and solution. It is a huge help for data science interviewees like me!

  • @MUSKAN0896
    @MUSKAN0896 3 роки тому

    This was really helpful! Thank you very much. Please do more such videos!

  • @relaxationmusic7094
    @relaxationmusic7094 2 роки тому

    Really nice video. Thank you so much!

  • @ranitdey5829
    @ranitdey5829 2 роки тому

    Your content is amazing. I love the way you have explained the solutions.

  • @adwaithks
    @adwaithks Рік тому

    I did it this way :
    with usa_people_count as (
    select count(*) as total_people_count from fb_active_users where country = 'USA'
    ), active_people_count_in_usa as (
    select count(*) as active_people_count from fb_active_users where country = 'USA' and status = 'open'
    )
    select cast(active_people_count as decimal) / total_people_count as share from usa_people_count, active_people_count_in_usa;

  • @mrblahblihblih
    @mrblahblihblih 3 роки тому +5

    Hi Nate, awesome video! Your content is really helping me with my upcoming interview with Facebook. I was wondering, would it have been possible to just do a left join with two subqueries instead? Or are you just optimizing for runtime

    • @stratascratch
      @stratascratch  3 роки тому +3

      You could do that but it would be really inefficient and unnecessary. This query is actually already pretty inefficient. I wouldn't even use a subquery to answer this. I would just use the case statements in the numerator and perform a count(*) for the denominator. That is probably the most efficient way to solve it. If you go to the platform, there is a tab called "solution from other users" where you'll find other ways users have solved the question. Some are really clever and concise. Goodluck on FB!

    • @mrblahblihblih
      @mrblahblihblih 3 роки тому

      @@stratascratch got it thanks! Would you say writing the most efficient query is important in the interview?

    • @stratascratch
      @stratascratch  3 роки тому +1

      @@mrblahblihblih It's not that important in an interview. Not THE most efficient that is. Your code should make sense and not have a bunch of unnecessary sections/clauses. You'll usually have an opportunity in the end to optimize it so you can tell the interviewer how you'd write more efficient code.

  • @PATRICKCHUAD
    @PATRICKCHUAD 3 роки тому +1

    Hi Nate, now I'm beginning to practice the steps on solving the problem, not just go directly and dive to the coding. your steps is very logical and can be very advantageous in more complex question. Divide and conquer is the focus to come up with a logical solution. Thanks for continuous sharing your knowledge.

    • @stratascratch
      @stratascratch  3 роки тому +1

      Thanks for continuing to watch the videos. I'll keep pumping more and more out!

    • @PATRICKCHUAD
      @PATRICKCHUAD 3 роки тому

      @@stratascratch BTW I'm incline to subscribe to your platform for Premium account to learn in deep on SQL solution for data science and maybe python as well. Does StrataScratch team help out if member is having difficulty solving any of hard problem in there ?

    • @stratascratch
      @stratascratch  3 роки тому

      @@PATRICKCHUAD Hey that's great. yes, we do help with any questions you have. There's a discussion board for each question. If you leave a question, we can answer them for you. It takes about 2-3 days for an answer. Check out the free questions to see the video solutions and discussion forum before upgrading. See if you like it!

    • @PATRICKCHUAD
      @PATRICKCHUAD 3 роки тому

      @@stratascratch Yes I have visited the site already several time and also tried to solve the problem in there. That is why I want to upgrade to premium to know if my solution is correct and also to know other solution possible to solve the problem. I pretty much like the platform. I maybe get the 1 year premium plan since it is a lot cheaper compared to the monthly one.

    • @PATRICKCHUAD
      @PATRICKCHUAD 3 роки тому

      Seems no paypal option for payment. Is it correct ?

  • @ranajparida5412
    @ranajparida5412 2 роки тому

    Hey Nat, by looking the question, i am not able to find where question is asking about the ratio..???

  • @hhliu7287
    @hhliu7287 3 роки тому +1

    Hi Nate, I didn't get it why you didn't select the distinct user during your code. as count(case when status='open' then user_id else null)
    if a user showed twice, it will be 2 for his count

    • @stratascratch
      @stratascratch  3 роки тому

      That's a great point! I should have caught that edge case but I didn't. We should definitely add a distinct so we don't double count.

  • @MrJohn2brown
    @MrJohn2brown 3 роки тому +1

    Hey @ Nate, great video’s and thanks so much for the practical scenarios. I have been asked about Mapping in a data analysis interview and was told alot of mapping will be part ofthe job. Any videos on that any time soon ? Thanks

    • @stratascratch
      @stratascratch  3 роки тому

      Hi Tahir. Great job on your analysis interview so far. Can you elaborate on mapping? What do you mean exactly?

    • @MrJohn2brown
      @MrJohn2brown 3 роки тому

      @@stratascratch thank you, Hopefully better luck next time. Mapping data base. I didn’t understand and probably thats why I didn’t pass. The company had Apache Hive. Im guessing it had to do with creating tables not just query from them. Is that a normal task of a data analysts ?

    • @stratascratch
      @stratascratch  3 роки тому

      @@MrJohn2brown To me mapping means creating tables where you're mapping one table to another. Basically it's an intermediate table that connects 2 other tables together. Sometimes there's a lot of clean up and logic required to map 2 tables together so a mapping/intermediate table can be used. Or another definition could be creating a table that has data from multiple other tables together. Like if you have a user table but some info comes from engagement tables and other info comes from a payments table. That could be considered a mapping table too. Both require creating databases. Hope that helps.

    • @MrJohn2brown
      @MrJohn2brown 3 роки тому

      @@stratascratch I see, so its like joining on temp tables. Thanks Nate

    • @stratascratch
      @stratascratch  3 роки тому

      @@MrJohn2brown Similar yes. That's my guess without actually seeing the question or exercise you were required to do =)

  • @radhsri
    @radhsri 3 роки тому

    Hi Nate, thanks for the awesome videos. At 7:13, I dont understand how count(NULL) will NOT count that particular row. I was going to use SUM(CASE(WHEN status = open, 0, 1)) but will a COUNT skip a row having a NULL value? thanks

    • @stratascratch
      @stratascratch  3 роки тому +2

      That's basically how the functions are meant to work. count() won't count any NULL rows and sum() will count all rows whether Null or not. In your case the sum() + Case statement is how you should approach it if you want to count only opens. Here's more info discuss.codecademy.com/t/when-do-we-use-count-or-sum/351543

  • @Zzzz-hk5ft
    @Zzzz-hk5ft 3 роки тому

    Solid thanks man

  • @madhulikasuman2803
    @madhulikasuman2803 2 роки тому +1

    hey guys, does the fb data analyst interviews has whiteboard or IDE for sql test?

    • @stratascratch
      @stratascratch  2 роки тому

      From what I recall, their video screening (1st round) is a SQL test that's on their platform called BlueJeans (or something like that). Basically it's a glorified notepad. You can't execute any code. The in-person rounds are whiteboard. Hope that helps!

    • @madhulikasuman2803
      @madhulikasuman2803 2 роки тому

      @@stratascratch yes..that is right, thanks alot 😊

  • @chrisshaw1707
    @chrisshaw1707 3 роки тому

    Is casting the ratio as a float something that needs to be done only in certain SQL environments? The mysql editor on leetcode does produce a ratio when dividing an integer by another integer. Maybe this is because I used the round function though to go out 2 decimal places?

    • @stratascratch
      @stratascratch  3 роки тому

      It depends on the data type of the number. If they are both integers and you're dividing them, then you'll get an integer as the output, regardless of SQL engine being used. But if one of the numbers is a numeric or float then you'll get back a float/numeric. I would be wary of sql editors on platforms like leetcode, hackerrank, and stratascratch because there's a lot of things going on in the background that are outside of normal sql behavior. For example, with stratascratch we pass the sql output to python so that we can run our solution validation algorithm to see if you got the right answer. The conversion from sql to python isn't always clean and isn't always tied to how sql would normally behave...that's why you see our float outputs rounded to the nearest 3rd decimal. Hope this makes sense.

    • @chrisshaw1707
      @chrisshaw1707 3 роки тому

      @@stratascratch Awesome answer, thanks!

  • @charansai1133
    @charansai1133 3 роки тому

    Hi nate your vedios boost my confidence man Thanks a Lot

    • @stratascratch
      @stratascratch  3 роки тому

      Glad to have helped! Good luck on your interviews if you have any lined up

    • @charansai1133
      @charansai1133 3 роки тому

      @@stratascratch I want to do my master's on data engineering any suggestion

  • @zmey2k
    @zmey2k 3 роки тому

    I wrote my code just before you explained your solution and the only thing I changed is I passed the step with doubling user_id checking, instead I counted them with distinct prefix "count(distinct user_id)", everything else was the same (including nested select).

    • @stratascratch
      @stratascratch  3 роки тому

      That looks like it works! Great job and thanks for following along with the video.

  • @tarekelias463
    @tarekelias463 3 роки тому

    Thanks a lot for that!

  • @vrajpatel8256
    @vrajpatel8256 3 роки тому

    Great video!

  • @aishwaryareddy7454
    @aishwaryareddy7454 2 місяці тому

    how to reply if the interviewer ask, " if you are solving the question in intervals, will you be doing the same while on the job?"
    Just want to think of all possible questions can be asked in the coding interview.

  • @nilsaha8021
    @nilsaha8021 2 роки тому

    User_id 82: country Australia, User_id 34: Donald Ross country China in the table. But when you filtered out the country, both of those users are shown in the output. Could you please explain it?

    • @stratascratch
      @stratascratch  2 роки тому

      What's the code you used? Not sure why you're getting that

  • @shobhamourya8396
    @shobhamourya8396 3 роки тому

    Here's my solution:
    select count(case when status = 'open' then user_id else null end)*1.0/count(*)*1.0 as active_user_share
    from fb_active_users
    where country = 'USA'

  • @bindidesai2191
    @bindidesai2191 2 роки тому

    An attendance log for every student in a school district attendance_events :
    date | student_id | attendance
    • A summary table with demographics for each student in the district all_
    students : student_id | school_id | grade_level | date_of_birth | hometown
    Using this data, you could answer questions like the following:
    • What percent of students attend school on their birthday?
    • Which grade level had the largest drop in attendance between yesterday
    and today?

  • @mensenvau
    @mensenvau 3 роки тому

    How can I apply for a Data Base on facebook?.

  • @priyankalad7789
    @priyankalad7789 3 роки тому

    I am afraid if in real interviews we can test parts of logic or carry out code increment? What do we do in that case?

    • @stratascratch
      @stratascratch  3 роки тому +2

      Yea total valid point. In an interview, I would just split up the logic into steps and then write the code in steps. Then pause so that the interviewer can evaluate the code. This way they can catch any problems before you write out the entire solution. This framework has saved me several times on interviews =)

  • @cookiemonstauhh
    @cookiemonstauhh 3 роки тому +1

    Hi Nate i just wanna say thanks for the videos! Your data science interview question with solutions are really helpful and engaging:)

    • @stratascratch
      @stratascratch  3 роки тому

      Thank you and thanks for watching these videos. I try to make these videos from the POV of the interviewee or at least how I would want an interviewee to answer the question if I was the interviewer.

  • @neelanshunisingh971
    @neelanshunisingh971 3 роки тому

    Which portal you use for practise?

  • @adolfoir6177
    @adolfoir6177 3 роки тому

    Hi Nate,
    Does the SQL implementation (Oracle, MySQL, PostgreSQL..) matters when interviewing with FB?
    Which one would you recommend for FB particularly

    • @stratascratch
      @stratascratch  3 роки тому +2

      It doesn't at all. You basically just get a notepad when interviewing with FB. None of your code will execute so you can write in any SQL flavor you want.

    • @adolfoir6177
      @adolfoir6177 3 роки тому

      @@stratascratch
      Thank you Nate, this is very helpful

  • @Arslanqadri
    @Arslanqadri 2 роки тому +2

    you are not allowed to explore the data

    • @intrepid_grovyle
      @intrepid_grovyle 2 роки тому

      yeah exactly. during the interview you are just given a preview, and a static CoderPad environment with no way to run and test output

  • @techiewithcamera
    @techiewithcamera Рік тому

    My Solution:
    select us_active_users/us_total_users::decimal share_active_users
    from
    (select
    count(case
    when status='open' and country='USA'
    then user_id else null end) us_active_users,
    count(case
    when country='USA'
    then user_id else null end) us_total_users
    from fb_active_users)a;