Thanks DJ jiang for pointing out that negative offset of LEAD or LAG works in PostgreSQL and Snowflake. I was referring to T-SQL that negative offset is not allowed and I should have clarified it in the video. Details here docs.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-ver15.
Thanks Emma for sharing. What you shared helped me comprehend window functions. Most importantly, I applied those window functions to my day to day work, which boosted efficiency. Appreciated :)
Emma, your content is great. Your English is beautiful and clear. My respect For subscribers: I would like to use the comments to find ways to improve my English. If you are starting your journey in data analytics and you are a native speaker of English, we can be useful to each other) For example, weekly calls or something similar Best regards, Senior Data Analyst from Russia
Hi there, as a non-native speaker who's working in an English-speaking country, what I found most helpful is not to have English conversations, but to pay closer attention to all the sentences and expressions that come from the native speakers. Think about how you would have phrased your words vs. how they did. Find the difference and work on it. You could use everything as the source for that purpose: movies, youtube videos etc. But this is not to say that having conversations in English is not helpful. It is, but just maybe not as important as many people think. Cuz I know many non-native speakers, who have worked in English every day for many years, but still speak bad English.
This..... WAS FANTASTIC! I've got 4+ years of experience in SQL and I needed a quick refresher for window functions and this was exactly what I needed. Thank You!
Great video! One quick question: At 9:10, I thought SUM(value) OVER(ORDER BY value) will return cumulative sum; but SUM(value) OVER() ORDER BY value will return the SUM shown in the video. or is it a PostgreSQL thing? I'm a little confused now.
" I thought SUM(value) OVER(ORDER BY value) will return cumulative sum; but SUM(value) OVER() ORDER BY value will return the SUM shown in the video". This is the correct statement. There must have been an error in the video.
You are 100%. Reason: (1.) When ORDER BY is specified and FRAMING is allowed, the DEFAULT FRAME is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW i.e all values up to the current value in the partition ( 2.) When ORDER BY is not specified and FRAMING is allowed, the DEFAULT FRAME is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING i.e The entire partition
stopped in between of the video, the best part about the video is that it explains the difference between group by and window function. This is something not many youtuber do. I was struggling to visualize how it happens. Thanks a lot
Can you explain how this query works ? SELECT COUNT(p) OVER(order by p ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as p_count, p, c,d FROM `d.t` SELECT COUNT(p) OVER(order by p ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as p_count, p FROM `d.t` Why is the window affected when I specify c and d columns? I would expect to count 1 before, current one and next one. So for first row it would be 2 for next ones except last one 3 and for last one to be 3, because the window is made of 3 items so it will count 3 most of the time. But no.. it returns different counts that does not make any sense to me. I read the bigquery documentation and I really don't get it.
Sorry for another dumb question: I tried something like "select rank(a) over(order by a) from t" in PostgreSQL. It'll show error telling me to add WITHIN GROUP. But if I write "select rank() over(order by a) from t" then it's fine. Seems like I can't pass anything in the () after rank. Could anyone clarify?
Hello Emma, your videos are superb and great and very useful! Just one thing, we can use negative value when we use LEAD or LAG, basically lead(date,-1) over (order by date) equivalent of lag(date,1) over(order by date), which I tested in PostgreSQL and Snowflake, and they both works.
Thanks DJ for pointing it out! I was focusing on T-SQL docs.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-ver15 and should have mentioned it in the video. Thanks again!
@@emma_ding No worries, you are great! Your videos are very detailed and clear and well organized and very high level! Thank you and Keep up your hard works!
Thanks DJ jiang for pointing out that negative offset of LEAD or LAG works in PostgreSQL and Snowflake. I was referring to T-SQL that negative offset is not allowed and I should have clarified it in the video. Details here docs.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-ver15.
Hey Emma, these are great! Can you put this in a playlist called ACE The SQL interview and put future videos here about the topic? :) TYVM!
@@tahirmuhammad9646 Great suggestion, Tahir! Here you go ua-cam.com/play/PLY1Fi4XflWStrBVJc2o9SbSQp9dRt3l9W.html! :)
This is the best vedio talking about window fundction on UA-cam, thanks Emma! Please keep up the great work!
Your channel is the best on all DS interviews! well organized, clear and short!
Wow, thanks!
Thanks Emma for sharing. What you shared helped me comprehend window functions. Most importantly, I applied those window functions to my day to day work, which boosted efficiency. Appreciated :)
This video is so condensed and packed full concise and well explained materials. Thank you for the fantastic content.
Super clear and orginised, thank you!!! 👍
This was very helpful. Beautiful content
Emma, your content is great. Your English is beautiful and clear. My respect
For subscribers:
I would like to use the comments to find ways to improve my English. If you are starting your journey in data analytics and you are a native speaker of English, we can be useful to each other) For example, weekly calls or something similar
Best regards, Senior Data Analyst from Russia
Hi there, as a non-native speaker who's working in an English-speaking country, what I found most helpful is not to have English conversations, but to pay closer attention to all the sentences and expressions that come from the native speakers. Think about how you would have phrased your words vs. how they did. Find the difference and work on it. You could use everything as the source for that purpose: movies, youtube videos etc.
But this is not to say that having conversations in English is not helpful. It is, but just maybe not as important as many people think. Cuz I know many non-native speakers, who have worked in English every day for many years, but still speak bad English.
Thank you! 😃
This..... WAS FANTASTIC! I've got 4+ years of experience in SQL and I needed a quick refresher for window functions and this was exactly what I needed. Thank You!
Thanks! Glad you find it helpful :)
Emma luv u. Your explanations are very elaborative and succinct at the same time.
Very clear, detailed, and amazing explanation
SUBSCRIBED from more great videos adventure
Thank you
Great lecture!!! Exactly what I was looking to solve my problem at work. I am definitely subscribing to your channel.
So happy to hear you found my video helpful! Thanks for watching. 😊
Shouldn't the SUM() at 9:09 do a cumulative sum since order function is mentioned ?
Great video! One quick question: At 9:10, I thought SUM(value) OVER(ORDER BY value) will return cumulative sum; but SUM(value) OVER() ORDER BY value will return the SUM shown in the video. or is it a PostgreSQL thing? I'm a little confused now.
There must be an error in her table bc it should be a rolling sum bc the sum(value) cannot be the same if it is ordered by the value for all rows
" I thought SUM(value) OVER(ORDER BY value) will return cumulative sum; but SUM(value) OVER() ORDER BY value will return the SUM shown in the video".
This is the correct statement. There must have been an error in the video.
You are 100%. Reason: (1.) When ORDER BY is specified and FRAMING is allowed, the DEFAULT FRAME is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW i.e all values up to the current value in the partition ( 2.) When ORDER BY is not specified and FRAMING is allowed, the DEFAULT FRAME is ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING i.e The entire partition
Emma: How are you? I am from Brazil. Do you have a video about partitions and frames?
Hi Emma, what app do you use to record yourself and the screen?
Keep going!
The order on the playlist is not correct
awesome
It's sad that you can't use variable to specify rows in window functions.
The table at ~9:00 is wrong. should go with running total, running average, etc.
This is the best video about window functions that I have seen so far! Wow! Congrats and thanks!
Thank you very much! 🔥
stopped in between of the video, the best part about the video is that it explains the difference between group by and window function. This is something not many youtuber do. I was struggling to visualize how it happens. Thanks a lot
Glad to hear it was helpful for you, Krishnam! Thanks for watching. 💛
I just finished a couple SQL crash courses and this is by far the BEST and most comprehensive refresher video for window functions I watched. Kudos!
Can you explain how this query works ?
SELECT
COUNT(p) OVER(order by p ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as p_count,
p, c,d
FROM `d.t`
SELECT
COUNT(p) OVER(order by p ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as p_count,
p
FROM `d.t`
Why is the window affected when I specify c and d columns?
I would expect to count 1 before, current one and next one. So for first row it would be 2 for next ones except last one 3 and for last one to be 3, because the window is made of 3 items so it will count 3 most of the time.
But no.. it returns different counts that does not make any sense to me. I read the bigquery documentation and I really don't get it.
This is awesome, crystal clear!
Great content as always!
Sorry for another dumb question: I tried something like "select rank(a) over(order by a) from t" in PostgreSQL. It'll show error telling me to add WITHIN GROUP. But if I write "select rank() over(order by a) from t" then it's fine. Seems like I can't pass anything in the () after rank. Could anyone clarify?
I have been storing so many CTEs using the WITH statement lol. I think these window functions could make my queries less verbose.
Great content!
i don't get rank and dense rank
would you mind sharing some videos for python coding interview as well?
Hi Minjeong, this one might be helpful to you.
ua-cam.com/video/hAqg2dlNeUc/v-deo.html
but why does the rank function skip 4 and 6 though??
Its by design. Think Olympics when there is a tie.
Amazing explanation!
Glad you found it helpful, Raj! 😊
Hello Emma, your videos are superb and great and very useful! Just one thing, we can use negative value when we use LEAD or LAG, basically lead(date,-1) over (order by date) equivalent of lag(date,1) over(order by date), which I tested in PostgreSQL and Snowflake, and they both works.
Yes, but it's syntactic sugar) For example in McSQL it won't work
Thanks DJ for pointing it out! I was focusing on T-SQL docs.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-ver15 and should have mentioned it in the video. Thanks again!
@@emma_ding No worries, you are great! Your videos are very detailed and clear and well organized and very high level! Thank you and Keep up your hard works!
Nice job young lady 🙂