This is great Ankit! This was my approach select a.*, lead(prod_repeat,1) over(partition by userid order by userid) as next_prod, case when (lead(prod_repeat,1) over(partition by userid order by userid)) = prod_repeat then 'Y' else 'N' end as repeat_cx from (select *, rank() over(partition by userid order by userid,productid) as prod_repeat from purchase_history order by userid) as a
Hi Ankit...Thanks for the content My approach WITH all_data AS( SELECT *,DENSE_RANK()OVER(PARTITION BY userid,productid ORDER BY purchasedate ASC) AS rn FROM purchase_history) SELECT userid FROM all_data GROUP BY userid HAVING max(rn)=1 AND count(distinct purchasedate)>1
@@ankitbansal6 Hi Ankit, What if a user bought a product only once, that time filter in having clause which says more than one date fails the record isnt it? My solution considering customer purchased for more than one day and different product Select p1.userid as users from purchase_history p1 left join purchase_history p2 on p1.userid =p2.userid and p1.productid = p2.productid and p1.purchasedate p2.purchasedate group by p1.userid having sum(case when p2.userid is null then 0 else 1 end) = 0 and count(distinct p1.purchasedate)>1
Hi Ankit, Thanks for the question. I really liked your approach however I tried using two ctes. with cte_samepdoduct as ( /* Using denserank, we can get those same products which have been bought on different dates, so that we can filter it out later */ select *,DENSE_RANK() over(partition by userid,productid order by productid,purchasedate) as drn from purchase_history ) , cte_purchasecount as ( /* In this cte, we are counting distinct purchase date so that we can filter those userid later */ select userid,count(distinct Purchasedate) as days_cnt from purchase_history group by userid ) Select userid from cte_purchasecount where days_cnt > 1 and userid not in ( select userid from cte_samepdoduct where drn>1 )
Bit more optimized: (Came with the similar solution, good to see more people use window functions) WITH criteria AS( SELECT userid, DENSE_RANK() OVER (PARTITION BY userid, productid ORDER BY purchasedate) as sameproduct, DENSE_RANK() OVER (PARTITION BY userid ORDER BY purchasedate) as differentdays FROM purchase_history ) SELECT userid FROM criteria GROUP BY userid HAVING MAX(sameproduct) = 1 AND MAX(differentdays) > 1 But DENSE_RANK() should be the answer. Remember though, this isnt ANSI SQL.
Thank you for another useful video. select userid from purchase_history group by userid having count(1)=count(distinct productid) and count(distinct purchasedate)>1 Sorry i solved first put the comment and then watch the video.we have the same solution.We think the same. All the best
My approach is as followed: with ctee as( select userid from purchase_history group by userid having count(distinct purchasedate)=1), cte as( select userid, count(productid) as cnt from purchase_history group by userid, productid having count(productid)>1 ) select distinct userid from purchase_history where userid not in (select userid from cte union select userid from ctee)
select userid from purchase_history where userid not in (select userid from (select p.*, row_number() over(partition by userid, productid ) as rn from purchase_history p ) where rn > 1 ) group by userid having count(distinct(purchasedate))>1
If someone ordered 2 units of product 1 on day one and 1 unit of product 2 on day two, then the product count will be 3, and the distinct product count will be 2. Technically he purchased different products on different days but the logic we applied in this video will rule out that user. Kindly tell me if I'm applying the wrong logic. If my logic is right, then we've to apply these conditions in the Having clause: COUNT(DISTINCT purchasedate) > 1 AND (COUNT(productid) = COUNT(DISTINCT productid) OR COUNT(DISTINCT productid) >= COUNT(DISTINCT purchasedate))
Thanks learnt a lot from you. Your is such a clean code- Mine got too messy- with cte as( Select *, row_number() over(partition by userid, productid order by userid) as rn, lead(purchasedate,1) over(partition by userid) as ld, Case when purchasedate != lead(purchasedate,1) over(partition by userid) then 1 end as fp from purchase_history) Select userid from cte where userid not in (Select userid from cte where rn>1) group by userid having count(fp) >=1 ;
Hi Ankit, your approach is better than mine. But I am sharing my approach anyway: In cte I am finding for each puchchase, whether the productid matches with that customer's any previous purchase productid, if 'no' then that productid will be new for that customer. Then in main query, first discarding customers who puchased only in one day. Then, I am counting total number of product puchases made by each customers and number of new kind of product purchases by them. If these two values are equal then only selecting that customer. The PostgreSQL syntax is as follows: with cte as( SELECT userid, productid, purchasedate, DENSE_RANK() over(PARTITION by userid ORDER by purchasedate) as purchase_day_serial_num case WHEN productid in (SELECT productid from purchase_history p2 where p2.userid=p1.userid and p2.purchasedate1 AND COUNT(userid)=COUNT(case when new_product='yes' then 1 END);
Awesome video as always. Just a small doubt , if one userid bought a product twice in the same day? Example here- What is User _Id 2 purchased product_Id 2 for the second time on 2012-01-23 instead of 2012-01-25. Then in this case User_Id 2 purchased different products in different days as well. Please clarify !
How come product_id 2 is unique here? Question says we need to identify unique products purchased on different days. So product_id 2 is not valid here.
I think this will be the simplest solution:: with cte as(select userid,productid,purchasedate, dense_rank() over(partition by userid order by productid) as rn from purchase_history) select userid from cte group by userid having max(rn) = count(rn) and count(distinct(purchasedate)) >=2 ::
with cte as( select *,row_number() over(partition by userid,productid order by purchasedate) as rn FROM purchase_history) select userid from cte group by userid having max(rn)=1 and count(distinct purchasedate)>1
with cte as( select *,count(1) over(partition by userid,productid) as cnt from purchase_history), cte2 as( select * from cte where userid not in (select userid from cte where cnt>1)) select userid from cte2 group by userid having count(distinct purchasedate)>1;
My solution with cte as ( select *, row_number() over(partition by userid, productid) as rn from purchase_history ), cte2 as ( select distinct userid from cte where rn = 2 ) select userid from purchase_history where userid not in (select * from cte2) group by userid having count(distinct purchasedate) > 1
select userid from (select *,row_number() over(partition by userid,productid order by purchasedate)-1 pr from purchase_history) group by 1 having sum(pr) = 0 and count(distinct purchasedate)>1
My approach : select userid, count(distinct purchasedate) as dist_purdt, count(productid) as total_prods, count(distinct productid) as total_dist_prods from ( select userid, productid, purchasedate from purchase_history group by userid, productid, purchasedate) group by userid having count(distinct purchasedate) > 1 and count(productid) = count(distinct productid) ;
HI @Ankit , This is the other way of implementation select userid,max(product_repeat_purchase) as product_repeat_purchase from (select *,rank() over (partition by userid,productid order by purchasedate ) as product_repeat_purchase from purchase_history where userid in (select * from (select userid from purchase_history group by userid having count(distinct purchasedate) > 1))) as a group by userid having max(product_repeat_purchase) = 1; userid | product_repeat_purchase
select * from ( select userid, count(distinct purchasedate) as dist_date, count(productid) as PC, count(distinct productid) as DPC from purchase_history group by 1) where dist_date >1 and PC=DPC
Question is little bit misleading I would say. Your solution is also allowing the user who has purchased "different products on the same day". For eg: userid 1 For "different products on different days" only userid 4 would be correct as it has unique date and productid: select userid from purchase_history group by userid having count(distinct purchasedate) > 1 and count(distinct productid) = count(productid) and count(distinct purchasedate) = count(purchasedate);
Hi Ankit, i love your videos. Thankyou very much for all the efforts you are keeping to make these videos and helping the Data community. Please chek my solution with cte1 as ---This cte returns, users who bought same product more than once ( select userid,productid, count(1) as tot from purchase_history group by userid, productid having count(1) > 1 ) select distinct userid from purchase_history where userid not in (select userid from cte1) --Filtering users who bought same product more than once and userid not in ( select userid ----Filtering users who dont buy on different dates from purchase_history group by userid having count(distinct purchasedate) = 1 )
What if a user purchase same product more than once on the same day and next two different products on next two days. He is an eligible customer who bought distinct products on distinct days. But when we do count(product_id) = count(distinct product_id) this user is not accounted for..please clarify
I tried to do it with cte and lead function :- with cte as ( select *, lead(productid) over (partition by userid order by productid) as next_purchase from purchase_history where userid in (select userid from purchase_history group by userid having count(distinct(purchasedate)) >1) ) select distinct(userid) from cte where userid not in (select userid from cte where productid = next_purchase);
Hello Ankit I solved this question using different approach, but may not be easiest way 😁😁 : with cte as (select userid,count(distinct purchasedate)cnt from purchase_history group by userid,purchasedate), cte1 as ( select userid from cte group by userid having sum(cnt)>=2),cte2 as( select p.userid,p.productid,p.purchasedate from cte1 c join purchase_history p on c.userid = p.userid),cte3 as (select userid from cte2 group by userid,productid having count(*) = 2) select distinct userid from cte2 where userid not in (select * from cte3)
Biggest challenge is Amazon Interviews will ask us to write the SQL code in notepad with no option to run and see these data transformations at different stages :)
with cte as ( select userid, count(distinct purchasedate) as count_purchasedate, count(productid) as new_p, count(distinct productid) as new_dp from purchase_history group by userid having count_purchasedate > 1) select case when new_p = new_dp then userid end as userid from cte where (case when new_p = new_dp then userid end) is not null
with cte as ( select *,count(productid) over(partition by productid,userid) as pro_count, count( purchasedate) over (partition by purchasedate,userid) as date_count from purchase_history ) select distinct userid from cte where pro_count = 1 and Date_count = 1 getting optimum output , is there any wrong in this
with du as (select count(productid) over (partition by userid,purchasedate order by purchasedate) ctn, * from purchase_history ) select userid from du where ctn=1 group by 1
Hi Ankit, Thanks for the value you are creating. Your videos really help us. I think this solution is not matching with question. What happens if a customer buy same product on same day? Ex: Userid 1 bought productid 1 twice on 2012-01-23. As the question is about customers not buying same product on different day but it is okay to buy twice on same day.
Hi Sir, The question states that products purchased on any given day are not repeated on other day .. What if a person purchases the same product twice on the same day. This is not been taken into account
Can you please make a video on often asked postgreSQL function questions in interviews, such as COALESCE, EPOCH, INTERVAL, EXIST etc, if possible.. Or you can solve a postgreSQL interview question which includes either one or more of the above mentioned functions..
Hi Ankit, does this make sense? with tab as (select userid,productid,count(distinct purchasedate) as cn from purchase_history group by userid,productid) select distinct userid from purchase_history where userid not in (select userid from tab where cn>1) and userid in (select userid from purchase_history group by userid having count(distinct purchasedate)>1)
There is one assumption here that there won't be same product for a single date for a user, in that case this will produce wrong output. This solution will works in the above case also. with purchase_history_compared AS ( -- join each row with a different transaction -- which is not on samedate and is done by same user -- It will also filter out userid who don't have 2 purchases SELECT a.userid, a.productid, b.productid as next_productid FROM practice.purchase_history a JOIN practice.purchase_history b ON a.userid = b.userid and a.purchasedate != b.purchasedate ) , repeat_flag_table AS( -- mark those transaction as repeated where the product matches SELECT userid, CASE WHEN next_productid = productid THEN 0 ELSE 1 END as repeat_flag FROM purchase_history_compared ) SELECT -- Filter those users who have all non repeated transactions userid FROM repeat_flag_table GROUP BY userid HAVING SUM(repeat_flag) = COUNT(1)
How about this query with cte as ( select *, row_number() over(partition by userid) as rn from purchase_history) select distinct userid from cte where userid not in(select userid from cte where productid != rn);
Hi sir i am currently working in amazon and preparing for Business analyst role, can you please make a video on comparison between Temporary table and CTE i tried finding the difference but not able to find significant change or justifiable answer please help me with this topic.
Great Video, but the solution given would not work in the scenario if same product is bought more than once on the same day. The following query would work in that situation as well. WITH cte_1 AS ( SELECT *, LAG(productid) OVER(PARTITION BY userid ORDER BY productid) AS prev_product, LAG(purchasedate) OVER(PARTITION BY userid ORDER BY productid) AS prev_date FROM purchase_history ) SELECT userid FROM cte_1 c1 GROUP BY userid HAVING SUM(IF(productid = prev_product AND purchasedate prev_date, 1,0)) = 0 AND COUNT(DISTINCT purchasedate) > 1
with du as (select count(productid) over (partition by userid,purchasedate order by purchasedate) ctn, count( purchasedate) over (partition by userid,purchasedate order by userid) ctn1, * from purchase_history ) select userid from du where ctn=1 and ctn1=1 group by 1
@@ankitbansal6 In that case the query will return user id 3 as well . So a better solution would be to use Union and select userid from cte where Total_Products=1
Hi Ankit Sir, Can you please give a verdict on the solution? with cte1 as ( select a.* from purchase_history as a left join ( select userid , productid , count(1) as product_count from purchase_history group by userid , productid having count(1) > 1 ) as b on a.userid = b.userid where b.userid is null ) select userid from cte1 group by userid having count(distinct purchasedate) > 1;
Hi ankit. Iam waiting for past few months for your course. fees charges are higher and normal individual cannot afford. please reconsider the price. I emailed you ankit. THANKYOU . ALL THE BEST. NEW TOPICS VERY INTERESTING.
Hi Ankit , I used a concept from your previous videos,let me know your comments. select userid from (select *, lag(productid,1) over (partition by userid order by purchasedate) as prevproduct, lag(purchasedate,1) over (partition by userid order by purchasedate) as predate from purchase_history) A group by userid having count(distinct purchasedate) != count(distinct prevdate) and count(distinct purchasedate) >= 2;
select userid from ( select userid, diff from ( select *, productid-rw as diff from ( select *, row_number () over (partition by userid order by purchasedate ) as rw from purchase_history) a ) b group by userid, diff) c group by userid having count(*)=1
@ankit bansal check this out Select distinct userid, productid, purchaseddate From purchase_history Simple solution why u always given tough sols to confuse people?
My Solution using self join: select a.userid from purchase_history a inner join purchase_history b on a.userid = b.userid and a.purchasedate > b.purchasedate group by a.userid having count(distinct case when a.productid = b.productid then 1 else 0 end)=1 and max(case when a.productid = b.productid then 1 else 0 end)=0
select distinct a.userid from purchase_history a join purchase_history b on a.userid = b.userid and a.purchasedate < b.purchasedate where a.productid != b.productid and a.userid not in (select a.userid from purchase_history a join purchase_history b on a.userid = b.userid where a.productid = b.productid and a.purchasedate < b.purchasedate)
select userid from (select userid, case when purchasedate != next_date and productid != next_prod then userid end as diff from (select *, lead(productid) over (partition by userid) next_prod,lead(purchasedate) over(partition by userid) next_date from (select * from purchase_history where userid in (select userid from (select *, dense_rank()over (partition by userid order by purchasedate)rnk from purchase_history)d where rnk > 1))k)L)o where diff is not null; Thanks Ankit;
with cte_NoRepeate as ( SELECT [userid] , productid, count(distinct purchasedate) as distinctDays FROM [dbo].[purchase_history] where userid in ( select userid from [dbo].[purchase_history] group by userid having count(distinct purchasedate)>1 ) group by userid ,productid ) select userid from cte_NoRepeate group by userid having count(userid)=sum(distinctDays)
Bhaiyya my friend suggested to me your channel, So this is first time I am landing you channel To upskill my SQL ... But I am not able to understand how should I should watch your channel in order. Can you pls show the path.?
WITH CTE AS ( SELECT l.userid, l.productid AS lpro, r.productid AS ppro FROM purchase_history l JOIN purchase_history r ON l.userid = r.userid AND l.purchasedate
Hello Ankit Sir Thanks for the amazing video My solution for the same is ;with cte as ( select p1.userid from purchase_history p1 inner join purchase_history p2 on p1.userid=p2.userid and p1.purchasedate!=p2.purchasedate and p1.productid=p2.productid) select MAX(userid) as userid from purchase_history where userid not in (select * from cte ) group by userid having count(distinct purchasedate)>1
My solution using 2 CTE's but the showed solution is much more optimized !!! WITH CTE AS(SELECT USERID,COUNT(DISTINCT PRODUCTID) AS COUNT_DISTINCT_PID FROM purchase_history GROUP BY 1 HAVING COUNT(DISTINCT PURCHASEDATE)>1), CTE_2 AS(SELECT USERID,COUNT(PRODUCTID) AS COUNT_PID FROM purchase_history GROUP BY 1 HAVING COUNT(DISTINCT PURCHASEDATE)>1) SELECT CTE.USERID FROM CTE JOIN CTE_2 ON CTE.USERID=CTE_2.USERID AND CTE.COUNT_DISTINCT_PID=CTE_2.COUNT_PID;
with cte as ( select userid from ( select userid, purchasedate from purchase_history group by userid, purchasedate ) a group by userid having count(*) > 1 ), cte1 as ( select p.userid, p.productid, p.purchasedate from purchase_history p join cte c on c.userid = p.userid ) select userid from cte1 minus select c.userid -- users who bought the same product on two different dates from cte1 c join cte1 c1 on c.userid = c1.userid and c.productid = c1.productid and c.purchasedate < c1.purchasedate;
This is great Ankit!
This was my approach
select a.*,
lead(prod_repeat,1) over(partition by userid order by userid) as next_prod,
case when (lead(prod_repeat,1) over(partition by userid order by userid)) = prod_repeat then 'Y' else 'N' end as repeat_cx
from (select *,
rank() over(partition by userid order by userid,productid) as prod_repeat
from purchase_history
order by userid) as a
select userid from purchase_history group by 1 having count(distinct productid) =count(productid) and count(distinct purchasedate)>1
Hi Ankit...Thanks for the content
My approach
WITH all_data AS(
SELECT *,DENSE_RANK()OVER(PARTITION BY userid,productid ORDER BY purchasedate ASC) AS rn
FROM purchase_history)
SELECT userid
FROM all_data
GROUP BY userid
HAVING max(rn)=1 AND count(distinct purchasedate)>1
Good one 👍
@@ankitbansal6
Hi Ankit,
What if a user bought a product only once, that time filter in having clause which says more than one date fails the record isnt it?
My solution considering customer purchased for more than one day and different product
Select p1.userid as users
from purchase_history p1
left join purchase_history p2
on p1.userid =p2.userid
and p1.productid = p2.productid and p1.purchasedate p2.purchasedate
group by p1.userid
having sum(case when p2.userid is null then 0 else 1 end) = 0 and count(distinct p1.purchasedate)>1
Hi Ankit, Thanks for the question. I really liked your approach however I tried using two ctes.
with cte_samepdoduct as
(
/* Using denserank, we can get those same products which
have been bought on different dates, so that we can
filter it out later */
select *,DENSE_RANK() over(partition by userid,productid order by productid,purchasedate) as drn
from purchase_history
)
, cte_purchasecount as (
/* In this cte, we are counting distinct purchase
date so that we can filter those userid later */
select userid,count(distinct Purchasedate) as days_cnt
from purchase_history
group by userid
)
Select userid from cte_purchasecount
where days_cnt > 1 and
userid not in ( select userid from cte_samepdoduct where drn>1 )
Bit more optimized: (Came with the similar solution, good to see more people use window functions)
WITH criteria AS(
SELECT userid,
DENSE_RANK() OVER (PARTITION BY userid, productid ORDER BY purchasedate) as sameproduct,
DENSE_RANK() OVER (PARTITION BY userid ORDER BY purchasedate) as differentdays
FROM purchase_history
)
SELECT userid
FROM criteria
GROUP BY userid
HAVING MAX(sameproduct) = 1 AND MAX(differentdays) > 1
But DENSE_RANK() should be the answer. Remember though, this isnt ANSI SQL.
Thank you for another useful video.
select userid from purchase_history
group by userid
having count(1)=count(distinct productid) and count(distinct purchasedate)>1
Sorry i solved first put the comment and then watch the video.we have the same solution.We think the same.
All the best
Excellent👏
simply supper
My approach is as followed:
with ctee as(
select userid
from purchase_history
group by userid
having count(distinct purchasedate)=1),
cte as(
select userid, count(productid) as cnt from purchase_history
group by userid, productid
having count(productid)>1
)
select distinct userid from purchase_history where userid not in
(select userid from cte
union
select userid from ctee)
select userid from purchase_history where userid not in (select userid from (select p.*, row_number() over(partition by userid, productid ) as rn from purchase_history p ) where rn > 1 ) group by userid having count(distinct(purchasedate))>1
If someone ordered 2 units of product 1 on day one and 1 unit of product 2 on day two, then the product count will be 3, and the distinct product count will be 2. Technically he purchased different products on different days but the logic we applied in this video will rule out that user. Kindly tell me if I'm applying the wrong logic.
If my logic is right, then we've to apply these conditions in the Having clause:
COUNT(DISTINCT purchasedate) > 1 AND (COUNT(productid) = COUNT(DISTINCT productid) OR COUNT(DISTINCT productid) >= COUNT(DISTINCT purchasedate))
Hi
Did you get the clarification?
I also have same doubt @ankit please clarify
Thanks learnt a lot from you.
Your is such a clean code- Mine got too messy-
with cte as(
Select *,
row_number() over(partition by userid, productid order by userid) as rn,
lead(purchasedate,1) over(partition by userid) as ld,
Case when purchasedate != lead(purchasedate,1) over(partition by userid) then 1 end as fp
from purchase_history)
Select userid from cte
where userid not in (Select userid from cte where rn>1)
group by userid
having count(fp) >=1
;
Hi Ankit, your approach is better than mine. But I am sharing my approach anyway:
In cte I am finding for each puchchase, whether the productid matches with that customer's any previous purchase productid, if 'no' then that productid will be new for that customer.
Then in main query, first discarding customers who puchased only in one day. Then, I am counting total number of product puchases made by each customers and number of new kind of product purchases by them. If these two values are equal then only selecting that customer.
The PostgreSQL syntax is as follows:
with cte as(
SELECT userid, productid, purchasedate,
DENSE_RANK() over(PARTITION by userid ORDER by purchasedate) as purchase_day_serial_num
case WHEN productid in (SELECT productid from purchase_history p2 where p2.userid=p1.userid and
p2.purchasedate1 AND COUNT(userid)=COUNT(case when new_product='yes' then 1 END);
Hi Ankit, Thank you for everything that you are doing. Can you pls post videos on Procedures, views, functions as well?
Noted
Awesome video as always. Just a small doubt , if one userid bought a product twice in the same day?
Example here- What is User _Id 2 purchased product_Id 2 for the second time on 2012-01-23 instead of 2012-01-25. Then in this case User_Id 2 purchased different products in different days as well. Please clarify !
How come product_id 2 is unique here? Question says we need to identify unique products purchased on different days. So product_id 2 is not valid here.
I think this will be the simplest solution::
with cte as(select userid,productid,purchasedate, dense_rank() over(partition by userid order by productid) as rn
from purchase_history)
select userid from cte group by userid having max(rn) = count(rn) and count(distinct(purchasedate)) >=2
::
with cte as(
select *,row_number() over(partition by userid,productid order by purchasedate) as rn
FROM purchase_history)
select userid
from cte
group by userid
having max(rn)=1 and count(distinct purchasedate)>1
best SQL channel 🔥
Thank you 😊
with cte as(
select *,count(1) over(partition by userid,productid) as cnt from purchase_history),
cte2 as(
select * from cte where userid not in (select userid from cte where cnt>1))
select userid from cte2 group by userid having count(distinct purchasedate)>1;
My solution
with cte as
(
select *,
row_number() over(partition by userid, productid) as rn
from purchase_history
),
cte2 as
(
select distinct userid
from cte
where rn = 2
)
select userid
from purchase_history
where userid not in (select * from cte2)
group by userid
having count(distinct purchasedate) > 1
select userid
from
(select *,row_number() over(partition by userid,productid order by purchasedate)-1 pr
from purchase_history)
group by 1
having sum(pr) = 0 and count(distinct purchasedate)>1
My approach :
select userid, count(distinct purchasedate) as dist_purdt,
count(productid) as total_prods, count(distinct productid) as total_dist_prods
from (
select userid, productid, purchasedate from purchase_history
group by userid, productid, purchasedate) group by userid
having count(distinct purchasedate) > 1
and count(productid) = count(distinct productid) ;
HI @Ankit ,
This is the other way of implementation
select userid,max(product_repeat_purchase) as product_repeat_purchase from (select *,rank() over (partition by userid,productid order by purchasedate ) as product_repeat_purchase from purchase_history where userid in (select * from (select userid from purchase_history group by userid having count(distinct purchasedate) > 1))) as a group by userid having max(product_repeat_purchase) = 1;
userid | product_repeat_purchase
select * from (
select userid,
count(distinct purchasedate) as dist_date,
count(productid) as PC,
count(distinct productid) as DPC
from purchase_history
group by 1)
where dist_date >1 and PC=DPC
Question is little bit misleading I would say. Your solution is also allowing the user who has purchased "different products on the same day". For eg: userid 1
For "different products on different days" only userid 4 would be correct as it has unique date and productid:
select userid from purchase_history group by userid having count(distinct purchasedate) > 1 and count(distinct productid) = count(productid) and count(distinct purchasedate) = count(purchasedate);
Hi Ankit, i love your videos. Thankyou very much for all the efforts you are keeping to make these videos and helping the Data community.
Please chek my solution
with cte1 as ---This cte returns, users who bought same product more than once
(
select userid,productid, count(1) as tot
from purchase_history
group by userid, productid
having count(1) > 1
)
select distinct userid
from purchase_history
where userid not in (select userid from cte1) --Filtering users who bought same product more than once
and userid not in ( select userid ----Filtering users who dont buy on different dates
from purchase_history
group by userid
having count(distinct purchasedate) = 1 )
What if a user purchase same product more than once on the same day and next two different products on next two days. He is an eligible customer who bought distinct products on distinct days. But when we do count(product_id) = count(distinct product_id) this user is not accounted for..please clarify
But for that particular day product_id should be consider only once if he has bought that product twice or thrice whatever.
Hi Ankit!
Thanks for creating such a Unique Content.
can you please execute below query and help us :)
create playlist SQL_Basic_to_Advance;
I tried to do it with cte and lead function :-
with cte as (
select *,
lead(productid) over (partition by userid order by productid) as next_purchase
from purchase_history
where userid in (select userid from purchase_history group by userid having count(distinct(purchasedate)) >1)
)
select distinct(userid)
from cte
where
userid not in (select userid from cte where productid = next_purchase);
Hello Ankit I solved this question using different approach, but may not be easiest way 😁😁 :
with cte as (select userid,count(distinct purchasedate)cnt from purchase_history
group by userid,purchasedate),
cte1 as (
select userid from cte
group by userid
having sum(cnt)>=2),cte2 as(
select p.userid,p.productid,p.purchasedate from cte1 c
join purchase_history p
on c.userid = p.userid),cte3 as
(select userid from cte2
group by userid,productid
having count(*) = 2)
select distinct userid from cte2 where userid not in (select * from cte3)
Biggest challenge is Amazon Interviews will ask us to write the SQL code in notepad with no option to run and see these data transformations at different stages :)
with cte as (
select userid, count(distinct purchasedate) as count_purchasedate, count(productid) as new_p, count(distinct productid) as new_dp
from purchase_history
group by userid
having count_purchasedate > 1)
select case when new_p = new_dp then userid end as userid
from cte
where (case when new_p = new_dp then userid end) is not null
with cte as (
select *,count(productid) over(partition by productid,userid) as pro_count,
count( purchasedate) over (partition by purchasedate,userid) as date_count
from purchase_history
)
select distinct userid from cte
where pro_count = 1 and Date_count = 1
getting optimum output , is there any wrong in this
Great question brother you are doing great work
Thanks a ton
with du as (select count(productid) over (partition by userid,purchasedate order by purchasedate) ctn, * from purchase_history )
select userid from du where ctn=1 group by 1
Hi Ankit, Thanks for the value you are creating. Your videos really help us.
I think this solution is not matching with question. What happens if a customer buy same product on same day? Ex: Userid 1 bought productid 1 twice on 2012-01-23. As the question is about customers not buying same product on different day but it is okay to buy twice on same day.
Hi Sir,
The question states that products purchased on any given day are not repeated on other day .. What if a person purchases the same product twice on the same day. This is not been taken into account
Thanks as always!
Could you please make a video on how to optimize the SQL queries with real examples?
I faced this in real interview. Could not solve it. :(
simple but tricky... cool question💙❤
Doss CTE maximises the execution time of the query ?
Thanks Brother for the Continues great Work.
Just asking if you can start any GCP Engineering concepts also like Dataflow, composer
Can you please make a video on often asked postgreSQL function questions in interviews, such as COALESCE, EPOCH, INTERVAL, EXIST etc, if possible..
Or you can solve a postgreSQL interview question which includes either one or more of the above mentioned functions..
Sure
Hi Ankit, does this make sense?
with tab as (select userid,productid,count(distinct purchasedate) as cn from purchase_history
group by userid,productid)
select distinct userid from purchase_history where userid not in (select userid from tab
where cn>1) and userid in (select userid from purchase_history group by userid having
count(distinct purchasedate)>1)
There is one assumption here that there won't be same product for a single date for a user, in that case this will produce wrong output.
This solution will works in the above case also.
with purchase_history_compared AS (
-- join each row with a different transaction
-- which is not on samedate and is done by same user
-- It will also filter out userid who don't have 2 purchases
SELECT a.userid, a.productid, b.productid as next_productid
FROM practice.purchase_history a
JOIN practice.purchase_history b
ON a.userid = b.userid and a.purchasedate != b.purchasedate
)
, repeat_flag_table AS(
-- mark those transaction as repeated where the product matches
SELECT
userid,
CASE WHEN next_productid = productid THEN 0
ELSE 1
END as repeat_flag
FROM purchase_history_compared
)
SELECT
-- Filter those users who have all non repeated transactions
userid
FROM repeat_flag_table
GROUP BY userid
HAVING SUM(repeat_flag) = COUNT(1)
How about this query
with cte as (
select *,
row_number() over(partition by userid) as rn
from purchase_history)
select distinct userid from cte where userid not in(select userid from cte where productid != rn);
Product IDs can be 100,200 . You need to write a generic solution.
Sir, what if a user buys the same product twice in a day. Then the output would not be correct ?
then first we will distinct it
Hi sir i am currently working in amazon and preparing for Business analyst role, can you please make a video on comparison between Temporary table and CTE i tried finding the difference but not able to find significant change or justifiable answer please help me with this topic.
Temporary table are for particular session , cte is used as permanent table for all sessions
Great video sir💯
Glad you liked it
Great Video, but the solution given would not work in the scenario if same product is bought more than once on the same day. The following query would work in that situation as well.
WITH cte_1 AS
(
SELECT *, LAG(productid) OVER(PARTITION BY userid ORDER BY productid) AS prev_product,
LAG(purchasedate) OVER(PARTITION BY userid ORDER BY productid) AS prev_date
FROM purchase_history
)
SELECT userid
FROM cte_1 c1
GROUP BY userid
HAVING SUM(IF(productid = prev_product AND purchasedate prev_date, 1,0)) = 0 AND COUNT(DISTINCT purchasedate) > 1
nice explanation 👍👍
Keep watching😊
with du as (select count(productid) over (partition by userid,purchasedate order by purchasedate) ctn,
count( purchasedate) over (partition by userid,purchasedate order by userid) ctn1,
* from purchase_history )
select userid from du where ctn=1 and ctn1=1 group by 1
Hey Ankit. It was a great question.. can you do some more questions based on Amazon bi role. Thank you..
Yes, definitely
Hi Ankit, if we have a customer who has bought just 1 product on a given date and we want to include them also, then this might not work.
You can just remove the first condition of distinct purchasedate> 1
Then it will work
@@ankitbansal6 In that case the query will return user id 3 as well .
So a better solution would be to use Union and select userid from cte where Total_Products=1
I am using mysql, I seem to be stuck on insert step. How do I change the date format in the session for mysql. Anyone can guide?
You can change the insertion script.
@@ankitbansal6 yes, I did that.
Hi Ankit Sir, Can you please give a verdict on the solution?
with cte1 as
(
select a.* from purchase_history as a left join
(
select userid , productid , count(1) as product_count from purchase_history group by userid , productid
having count(1) > 1
) as b on a.userid = b.userid where b.userid is null
)
select userid from cte1 group by userid
having count(distinct purchasedate) > 1;
Good one
Hi Ankit, messaged u in linkedin regarding a query.. can u plz post that..
u can post here as well
Hi ankit. Iam waiting for past few months for your course. fees charges are higher and normal individual cannot afford. please reconsider the price. I emailed you ankit. THANKYOU . ALL THE BEST. NEW TOPICS VERY INTERESTING.
Don't worry
Hi Ankit , I used a concept from your previous videos,let me know your comments.
select userid from
(select *,
lag(productid,1) over (partition by userid order by purchasedate) as prevproduct,
lag(purchasedate,1) over (partition by userid order by purchasedate) as predate
from purchase_history) A
group by userid
having count(distinct purchasedate) != count(distinct prevdate) and count(distinct purchasedate) >= 2;
Why lag function you are using ? We need to check across all the orders from a customer
@@ankitbansal6 okay, but in which test case this will fail?
Hello Ankit Is namaste SQL your website
I would like to register for your course
Yes it's mine
@@ankitbansal6 Thank You Sir
Which course would you suggest I apply for as I have basic skills in SQL
@@kunalparihar8665 content is similar in both
@@ankitbansal6 okay
logic man
select userid from (
select userid, diff from (
select *, productid-rw as diff from (
select *, row_number () over (partition by userid order by purchasedate ) as rw
from purchase_history) a ) b
group by userid, diff) c
group by userid
having count(*)=1
@ankit bansal check this out
Select distinct userid, productid, purchaseddate
From purchase_history
Simple solution why u always given tough sols to confuse people?
But it will give all the users
@whosac
You'll get all 4 users with your query which is wrong
When you are wrong but have humongous confidence 🤣
i want this level of confidence in my life 😁😁
My Solution using self join:
select a.userid
from purchase_history a
inner join purchase_history b
on a.userid = b.userid and a.purchasedate > b.purchasedate
group by a.userid
having count(distinct case when a.productid = b.productid then 1 else 0 end)=1 and max(case when a.productid = b.productid then 1 else 0 end)=0
select distinct a.userid
from purchase_history a
join purchase_history b
on a.userid = b.userid and a.purchasedate < b.purchasedate
where a.productid != b.productid and a.userid not in (select a.userid
from purchase_history a
join purchase_history b
on a.userid = b.userid
where a.productid = b.productid and a.purchasedate < b.purchasedate)
select userid from
(select userid, case when purchasedate != next_date and productid != next_prod then userid end as diff from
(select *, lead(productid) over (partition by userid) next_prod,lead(purchasedate) over(partition by userid) next_date from
(select * from purchase_history where userid in
(select userid from
(select *, dense_rank()over (partition by userid order by purchasedate)rnk from purchase_history)d
where rnk > 1))k)L)o
where diff is not null;
Thanks Ankit;
with cte_NoRepeate as (
SELECT [userid] ,
productid,
count(distinct purchasedate) as distinctDays
FROM [dbo].[purchase_history]
where userid in (
select userid from [dbo].[purchase_history]
group by userid
having count(distinct purchasedate)>1
)
group by userid ,productid
)
select userid from cte_NoRepeate
group by userid
having count(userid)=sum(distinctDays)
Bhaiyya my friend suggested to me your channel,
So this is first time I am landing you channel
To upskill my SQL ...
But I am not able to understand how should I should watch your channel in order.
Can you pls show the path.?
Start with SQL tips and tricks playlist
WITH CTE AS (
SELECT
l.userid,
l.productid AS lpro,
r.productid AS ppro
FROM
purchase_history l
JOIN purchase_history r ON l.userid = r.userid
AND l.purchasedate
Hello Ankit Sir Thanks for the amazing video
My solution for the same is
;with cte as (
select p1.userid from purchase_history p1
inner join purchase_history p2 on p1.userid=p2.userid and p1.purchasedate!=p2.purchasedate
and p1.productid=p2.productid)
select MAX(userid) as userid
from purchase_history where userid not in (select * from cte )
group by userid
having count(distinct purchasedate)>1
My solution using 2 CTE's but the showed solution is much more optimized !!!
WITH CTE AS(SELECT USERID,COUNT(DISTINCT PRODUCTID) AS COUNT_DISTINCT_PID FROM purchase_history
GROUP BY 1 HAVING COUNT(DISTINCT PURCHASEDATE)>1),
CTE_2 AS(SELECT USERID,COUNT(PRODUCTID) AS COUNT_PID FROM purchase_history
GROUP BY 1 HAVING COUNT(DISTINCT PURCHASEDATE)>1)
SELECT CTE.USERID FROM CTE JOIN CTE_2 ON CTE.USERID=CTE_2.USERID AND CTE.COUNT_DISTINCT_PID=CTE_2.COUNT_PID;
with cte as (
select userid from
(
select userid, purchasedate
from purchase_history
group by userid, purchasedate
) a
group by userid
having count(*) > 1
),
cte1 as (
select p.userid, p.productid, p.purchasedate
from purchase_history p
join
cte c
on
c.userid = p.userid
)
select userid from cte1
minus
select c.userid
-- users who bought the same product on two different dates
from cte1 c
join
cte1 c1
on
c.userid = c1.userid
and
c.productid = c1.productid
and
c.purchasedate < c1.purchasedate;