SQL Take-Home Challenge Tutorial
Вставка
- Опубліковано 21 лип 2024
- Today I'm solving an example SQL take-home challenge from Goodwater.
Check out our SQL interview guide: www.interviewquery.com/blog-s...
👉 Subscribe to my data science channel: bit.ly/2xYkyUM
Use the code "datasciencejay" and get 10% off data science interview prep 🔥 : www.interviewquery.com/pricin...
❓ Check out our SQL course: www.interviewquery.com/course...
🔑 Get professional coaching here: www.interviewquery.com/coachi...
🐦 Follow us on Twitter: / interview_query
Quick Links:
Introduction: 0:00
Question - 2:10
SQL Coding - 5:10
Follow Jay on:
Twitter: / datasciencejay
LinkedIn: / jay-feng-ab66b049 - Наука та технологія
Thanks Jay, enjoyed the challenge! Union removes duplicates, so if the same product is downloaded the same number of times on iOS and Android on a day, union only keeps one of the records. So, use union all instead!
I'm getting ready to have a few take-home SQL tests for some jobs I am interviewing for. This is wonderful. Thanks for putting this video together.
why do we have to create two temporary tables? Cant we just use "select trailing_downloads as dd1, trailing_downloads as dd2"? or is it just a mysql thing? thanks
The question defines trailing n-day average as D-n+1, so for 28 day trailing average, the code should be looking at datediff of 27 rather than 28 in this video right?
Another method i'm thinking is rather than starting with join, join, union. What if we tidy the product table first with prod,cat,ios UNION prod,cat,android, then join to downloads. How would this perform against the video method?
Hey it's Jay! Nice vid, man
Hey guys! Thanks for watching and check out our latest 2021 ultimate SQL interview guide: www.interviewquery.com/blog-sql-interview-questions/
anywhere to download the table script for practicing? thanks in advance
I honestly feel more confident after seeing Jay debugging haha, we all went through this.
What are tips that we need to keep in mind when we are creating the tables?
If there’s no primary key in the table, do we need to add it ourselves?
Any other particular thing we should keep in mind?
Can you use a window function, instead of self join, to do the trailing 28 days?
SELECT product_id, date,
AVG(total_downloads) OVER(PARTITION BY product_id ORDER BY date ROWS BETWEEN CURRENT ROW AND 28 PRECEDING) AS avg_28
FROM cte
cool this was my first idea too (but it's hard to remember the ROWS BETWEEN syntax!
That assumes all the consecutive dates exist in every row, so 1 row back equals 1 date back. If 1 date is missing, 1 row back becomes 2 days back. The method in this video has no dependence on row count and purely looks at date values. Maybe the Window Function + specifying previous frame works with pre-proprocessing by generating a series of dates, then left joining given data to the generated continuous date series to ensure no time gaps. Then 1 row preceding will definitely be 1 day preceding too
you are the best!
Thanks!
would this join work for part 1?
select p.product_id, d.date, sum(d.downloads) as num_downloads
from products p join downloads d on p.ios_app_id = d.app_id or p.android_app_id = d.app_id
exactly what I thought
The union was completely unnecessary, just do "ON (d.app_id = p.android_app_id OR d.app_id = p.ios_ap_id)" that way you only query both of your tables only 1 time which is optimum, no need for subquery and also your code would be easier to read, always keep in mind the next person taking care of your code and do clean coding, that saves time to everybody
+ 1 , but good video Jay!
cool, I was thinking of the same OR clause for the join but I've never seen it before, will it work in real life?
@@brothermalcolm yes, I've used them for many years, it all comes to the usage of the parenthesis so you don't mess up other clauses
@@lehast Could you explain the performance advantage of OR over UNION? stackoverflow.com/questions/19725014/sql-server-left-join-with-or-operator This link says UNION rather than OR to avoid table scanning. I don't understand though why the UNION method won't table scan too? It seems if we have 10 rows in left table, and 20 rows in each of two right tables, then doing OR would be like 10x(20+20) which is the same as 10x20 union 10x20 so both ways should be equally fast?
1st comment.. keep em comin!!
Next time please keep full screen of code
God dam Kevin