Recursive CTE | Leetcode Hard SQL Problem 5 | Complex SQL 12
Вставка
- Опубліковано 30 чер 2024
- In the first part of this video we are going to discuss how recursive CTE works. In second part we will discuss a leet code SQL hard problem where we will be using recursive CTE concept.
Playlist for complex SQL questions:
• Complex SQL Questions ...
create and insert script for this problem. Do try yourself without using CTE.
create table sales (
product_id int,
period_start date,
period_end date,
average_daily_sales int
);
insert into sales values(1,'2019-01-25','2019-02-28',100),(2,'2018-12-01','2020-01-01',10),(3,'2019-12-01','2020-01-31',1); - Наука та технологія
awesome!
I took all the parameters in the recursive CTE to avoid the joins.
with recursive dt_cte as (
select period_start,period_end,product_id,average_daily_sales from sales
union all
select adddate(period_start,1),period_end,product_id,average_daily_sales
from dt_cte
where period_start
Good stuff. Thanks for posting.
These videos are extremely helpful. I have searched a lot in UA-cam but none solves these many hard problems. Kudos to you :)
Thank you 😊
@@ankitbansal6 and good thing is you give create and insert statements. that makes whole experience awesome.
This logic is just amazing!!!!! Too good solution. Even that inner join is a brilliant idea.
Thank you taking this up. really helpful to figure out the approach to solve any problem.
Great way to explain a concept, thank you bro.
Thanks Ankit! ... and below is the query for Mysql workbench (If any one wants).
Note: Addon/modified keywords/parameters are ---> recursive , date_add(dates,interval 1 day)
with recursive t_sales as
(select min(period_start) as dates,max(period_end) maxdate from sales
union all
select date_add(dates,interval 1 day) as dates,maxdate from t_sales
where dates < maxdate)
select product_id,year(dates) as report_year,sum(average_daily_sales) as total_amount
from t_sales inner join sales
on dates between period_start and period_end
group by product_id,year(dates)
order by product_id;
keep rocking Ankit . Excellent explanation!
amazing problem very helpful video
🔥this is very useful, thanks for sharing..
Thanks for the question and solution.
Hi Ankit, can you please post some more questions on this topic? Also, how to understand when to use CTE/Recursive CTE?
Awesome, I learnt recusive CTE very well
Good job!
Learnt Something New today . Thanks for the video ... But question is how to remember what approach we need to take ? mostly seeing such type of questions all that comes in mind is to use case statement or any analytical functioin ..
Excellent explanation on recursive CTE.
Glad you liked it
Nice one, Ankit :)
From Oracle/SQl...
with d_dates (dates,max_dates)
as
(
select min(period_start) as dates, max(period_end) as max_dates from sales
union all
select dates+1 as dates,max_dates from d_dates
where dates
zindagi me pehli baar recursive CTE smjh me aya hai
Shaandar zabardast zindabad
Nice!
Thank you 😊
Ankit what will be the order of execution in case of recursive CTE?
cant we take first and last day of year in between and make days as 365? and then count the sum? I think making it recursive will be heavy process
i think we should also take productid in CTE and use it in join so that if there are overlapping date range across product will not create issue.
It doesn't matter. I am covering all the dates for table. Overlap dates also will get joined to multiple product IDs. I think overlap is already there..
MYSQL Solution
with recursive sales_data as (
select * from sales
UNION ALL
select t1.product_id,
date_add(t1.period_start, interval 1 day) as period_start,
t1.period_end,t1.average_daily_sales
from sales_data t1 join sales t2
on t1.product_id = t2.product_id
and t1.period_start < t2.period_end
)
select product_id,year(period_start) as report_year,
sum(average_daily_sales) as total_amount
from sales_data
group by 1,2 order by product_id
with r_cte as(
select min(period_start) as date,max(period_end) as max_date from sales
Union all
select DATEADD(day,1,date),max_date from r_cte where date
select * from sales;
with recursive cte1 as
( select min(period_start) as dates,max(period_end) as max_dates from sales
union all
select date_add(dates,INTERVAL 1 DAY) as date,max_dates from cte1
where dates
with cte as(
select min(period_start) as first_date, max(period_end) as last_date from sales2
union all
select dateadd(day,1,first_date) ,last_date from cte where first_date < last_date
)
select sum(average_daily_sales) as total_sales, year(first_date) as years,product_id from cte
join sales2 on first_date between period_start and period_end
group by year(first_date),product_id
option(maxrecursion 1000)
I have also used recursive CTE but in a different way. In the last, I haven't use any JOIN clause to solve this problem.
Here is my code (very short & simple code I have written) : -
with recursive cte as(
select product_id, period_start, period_end, average_daily_sales from sales
union
select product_id, date_add(period_start, interval 1 day) period_start, period_end, average_daily_sales
from cte where period_start < period_end)
select product_id, year(period_start) report_year, sum(average_daily_sales)
from cte group by 1,2
Hello Shubham, can you explain why you have done 'from cte' in second select. Why are we not writing 'from sales' ? . The CTE has not yet been created so how are we calling cte within the not yet created cte
When I started solving this problem without watching this video, I was thinking of this solution and one another option of finding a date diff between 2 dates and calculate the total sum. Thanks.
Today I was asked the same question. I thought of using CASE for getting the years. Couldn't answer it. Immediately after the interview, I search it on leetcode followed by youtube mentioning "Total Sales by Amount Ankit Bansal" :) Thank you!
🙏
Which company? and Profile?
with cte(product_id,period_start,period_end,avg) as
(
select product_id,period_start,period_end,average_daily_sales from sales --anchor query
union all
select product_id,period_start + interval '1' day,period_end,avg --recursive query
from cte
where period_start < period_end --filter to stop the recursion
)
select product_id,to_char(period_start,'YYYY'),sum(avg)
from cte --where product_id=3
group by product_id,to_char(period_start,'YYYY')
order by product_id;
Thanks for posting 👏
I have a usecase as below
There are n number of nodes connected to each other,
Every output link of a node matches with the input link of another node
I have to make them join based on inputlinkid and outputlinkid
But the challenge here is I don't know how many nodes a flow has , so I have to create columns dynamically where each column represents a node name.
Can you please help me how to create dynamic columns while running recursive cte
Hi ankit.. Can you please tell how to know when to use recursive cte
with recursive cte1 as(select "2018-12-01" as num
union all select date_add(num,interval 1 day) from cte1 where num
Hi Ankit, you've beautifully explained the concept. Can you please help mw with the join between 2 dates, how it works? Thanks!
its the same. just like you join a integer, similar is date.
Hey Ankit, Thanks a lot for making this video. I had one small doubt so I have the exact same problem but some of the values in end date is null in data. How am I supposed to handle that in the query.
So you need to put some default value to replace nulls. Maybe current date or check with business what is the expectation
tried a different approach in the end:
with cte as
(
select min(period_start) as md,max(period_end) as pe from sales
union all
select dateadd(day,1,md),pe from cte
where dateadd(day,1,md)
with rec_sales as
(
select product_id, min(period_start) min_period, max(period_end) max_period, average_daily_sales
from sales
group by product_id, average_daily_sales
union all
select product_id, dateadd(day, 1, min_period), max_period, average_daily_sales
from rec_sales
where min_period < max_period
)
select product_id, year(min_period) report_year, sum(average_daily_sales) total_amount
from rec_sales
group by product_id, year(min_period)
order by 1, 2
option (maxrecursion 1000)
Sir, this approch ok with 1 million record's
DECLARE @startDate DATE, @endDate DATE;
SELECT @startDate = MIN(period_start) FROM sales;
SELECT @endDate = MAX(period_end) FROM sales;
WITH date_tbl AS
(
SELECT @startDate AS date_col
UNION ALL
SELECT DATEADD(DAY,1,date_col)
FROM date_tbl
WHERE date_col < @endDate
),
sales_cte AS
(
SELECT s.*, d.date_col
FROM sales AS s
INNER JOIN date_tbl AS d
ON d.date_col BETWEEN s.period_start AND s.period_end
)
SELECT product_id, YEAR(date_col) AS report_year, SUM(average_daily_sales) AS total_amount
FROM sales_cte
GROUP BY product_id, YEAR(date_col)
ORDER BY product_id ASC
OPTION(MAXRECURSION 0);
with recursive r_cte as (
select * from salesnew
union all
select product_id,date_add(period_start,interval 1 day),period_end,average_daily_Sales from r_cte
where period_start
solved the problem using postgresql
below is the code
with recursive c1 as (
select *
,extract(year from period_start) as start_year
,extract(year from period_end) as end_year
,row_number() over (order by period_start) as rn
from sales)
,c2 as (
select product_id,period_start as sales_start_date
,case when start_year = end_year then period_end
else (start_year||'-12-31')::date end as sales_end_date
,average_daily_sales,start_year as sales_year,
start_year,end_year,period_start,period_end,rn from c1
union all
select c1.product_id
,(((c2.sales_year)+1)::varchar||'-01-01')::date as sales_start_date
,case when (c2.sales_year+1)=c1.end_year then c1.period_end
else ((c2.sales_year+1)||'-12-31')::date end as sales_end_date
,c1.average_daily_sales
,(c2.sales_year+1) as sales_year,c1.start_year,c1.end_year
,c1.period_start,c1.period_end,c1.rn from c1
inner join c2
on c1.rn = c2.rn
and (c2.sales_year + 1)
Can anyone tell me what is the alternative of option in mysql workbench?
@Ritika Jaiswal use this
with recursive rec_cte as
(
select min(period_start) as dates, max(period_end) as max_date from sales
union all
select date_add(dates, interval 1 day) as dates, max_date from rec_cte where dates < max_date
)
select year(dates) as yr, product_id , sum(average_daily_sales) as total_sales from rec_cte
inner join sales on dates between period_start and period_end
group by yr,product_id
order by yr, product_id
WITH RECURSIVE r_cte AS (
SELECT MIN(period_start) AS dates, MAX(period_end) AS max_date FROM sales
UNION ALL
SELECT DATE_ADD(dates, INTERVAL 1 DAY) AS dates, max_date FROM r_cte WHERE dates < max_date
)
SELECT product_id, year(dates) as yr, sum(average_daily_sales) as total_amount FROM r_cte join sales on dates between period_start and period_end
group by product_id, year(dates) order by product_id, year(dates);
Your Teaching
with recursive cte as (
select period_start as start, period_end as end,
product_id, average_daily_sales
from sales
UNION ALL
select DATE_ADD(start, INTERVAL 1 DAY), end, product_id, average_daily_sales from cte
where DATEDIFF(end, start) > 0
)
select YEAR(start), product_id, count(*), sum(average_daily_sales) from cte
group by YEAR(start), product_id
order by product_id, YEAR(start);
select product_id,yr,sum(sal2) from
(select product_id,to_char(st1,'yyyy') yr,l,sal2 from
(select product_id,period_start+l-1 st1,l,average_daily_sales sal2 from sales s,
lateral(select level l from dual connect by level
Upar se gaya bhai. 1-2 aur video bana ke samjha do please. Thode aur examples ke saath
Okay. Recursive CTE Ka concept samajh aaya kya ? Ya leetcode samajh nhi aaya ?
@@ankitbansal6 recursive cte ka concept samajh aa gaya... leetcode waala bro. Maine rewind kar ke dekha still difficulty hua samajhne me. Fir se try karta hu samajhne ka. Still ek baar ye topic pe thode aur videos bana do kuch examples ke please
@@swapnil3845 okay. Ek baar query khud run karke dekho step by step ..
@@ankitbansal6 okay bro. Try karta hu. Anyway thanks video ke liye bhai..new concept tha. Abhi isko seekhta hu acche se
this is very memmory consuming query , could hv been optmised
Hello Ankit, can you explain why you have done 'from r_cte' in second select. Why are we not writing 'from sales' ? . The CTE has not yet been created so how are we calling cte within the not yet created cte.
I understand the 1st select (creating anchor value) and the 3rd select perfectly and the remaining code as well as the logic behind it.
Or maybe this is what the syntax of recursion is 😅
yeah, that's the syntax of recursive cte and that's why it gets its tag "recursive"
.. If not, then nothing special with recursive cte right?.
Hello Sir,
Could you please let me know the query for below question:
INPUT:
Date OrderValue
01-Sep-2021 10
05-Oct-2021 20
11-Dec-2021 30
OUTPUT:
Date Value
30-Sep-2021 10
31-Oct-2021 30
30-Nov-2021 30
31-Dec-2021 60
We need to tell cumulative sum of ordervalue(for missing month as well)
Thanks!
(Sorry for commenting but going through these set of questions in interviews)
select a.last,sum(a.order_Value) over (order by last) val from
(select case when d.rn 1 then 0 else det.order_value end order_Value,case when d.rn 1 then add_months(lastday,-rn+1) else lastday end as last
from
(select in_date,last_day(in_Date) lastday,order_value,to_char(in_date,'MM') mon,
nvl(round((in_date - lag(in_date) over (order by in_date))/30),1) sub
from detail) det,lateral(select rownum rn from dual connect by level
Last month from sept to dec( using date function)
We will get 4 rows say this is temp table
Now temp left join your_table
On (to_char( your_table.date,'YYYYMM') = to_char(temp.date,'YYYYMM'))
Once this is created rolling sum can be created using windowing clause ( rows between unbounded preceding and current row)
Hope this helps
Here I try to solve your que. Be aware that my date types are in %Y,%m,%d format
CREATE table testone
(org_date date,
ordervalue int);
insert into testone values ('2021-09-01',10),
('2021-10-05',20),('2021-12-11',30);
Ans=>Query
----------
with recursive tab(min_date,max_date) as
(
select min(org_date),max(org_date) from testone
union all
select date_add(min_date,interval 1 month),max_date from tab where extract(month from min_date) < extract(month from max_date)
)
select
concat_ws(
'-'
,case when extract(month from min_date) in (9,11) then 30 else 31 end
,extract(month from min_date),extract(year from min_date)
) org_date,
sum(ordervalue) over(order by min_date) sum
from tab left join testone
on month(org_date) = month(min_date)
Hi Shubham, Tried to solve your question in SQL server using recursive cte. Below is the solution.
--First we will find the minimum date and last date of maximum date and find all the dates between them
with r_cte as (
select min(Date) as min_date, dateadd(day,datediff(day,max(date),eomonth(max(date))),max(date)) as max_date from Projects.dbo.order_dates
UNION ALL
select dateadd(day,1,min_date) as min_date, max_date from r_cte where min_date
Do you have any Idea about how can I get all leetcode questions for free?
DM me on LinkedIn
what is the meaning of leet code? please can u explain me?
It's a website
Solution in pyspark using udf since recursive CTE is not supported:
from pyspark.sql import SparkSession
from pyspark.sql.types import *
from pyspark.sql.functions import *
import os
import findspark
findspark.init('C:/spark-3.3.1-bin-hadoop3/spark-3.3.1-bin-hadoop3/')
def generate_list(start,end):
date_list=[]
c=0
month_list={1:31,2:28,3:31,4:30,5:31,6:30,7:31,8:31,9:30,10:31,11:30,12:31}
start_year=start[0:4]
start_month=start[5:7]
end_year=end[0:4]
end_month=end[5:7]
start_date=start[8:10]
end_date=end[8:10]
if(start_year==end_year and start_month==end_month): # for same year same month
for i in range(int(start_date),int(end_date)+1):
date=str(start_year)+'-'+str(start_month)+'-'+str(i)
date_list.append(date)
elif(start_year==end_year and start_month!=end_month): # for same year diff month
while(int(start_month)
Only leap years are to be incorporated.
WITH cte AS
(SELECT product_id,
period_start,
period_end,
average_daily_sales
FROM sales
UNION ALL SELECT product_id,
DATEADD(DAY, 1, period_start) AS period_start,
period_end,
average_daily_sales
FROM cte
WHERE period_start
with recursive sale_dates as
(
select product_id, period_start as sale_date from sales
union all
select sale_dates.product_id, sale_date + 1 as sale_date from sale_dates, sales
where sale_date < sales.period_end and sale_dates.product_id = sales.product_id
),
sale_year_days as
(select product_id, date_part('year', sale_date) as sale_year, count(1) as sale_year_days
from
sale_dates
group by product_id, date_part('year', sale_date))
select s.product_id, sale_year as report_year, sale_year_days * average_daily_sales as total_amount
from
sale_year_days syd,
sales s
where
syd.product_id = s.product_id
order by product_id, report_year
VERY TOUGH
Please review the below code with out join and generating 100s of lines,
which might be a problem if we have a huge difference bw start and end date
with cte as (select product_id,period_start,period_end,average_daily_sales,
year(period_start) as year_start,
year(period_start) as year_end
from sales
union all
select product_id,period_start,period_end,average_daily_sales,
year_start+1,
year_start+1
from cte
where year_start
with recursive cte as (
select *, year(period_start) end_year, 1 as cnt
from sales
union all
select cte.product_id, cte.period_start, cte.period_end, cte.average_Daily_sales, (year(period_start)+cnt) end_year, cte.cnt+1 as cnt
from cte where (year(period_start)+cnt)