Hi Nate, Don't know if you're gonna see this. But I religiously followed your youtube channel and even took a subscription at StrataScratch for an amazon interview for the role of data engineer. I can't believe that I cracked it, just got the mail that I got selected!!! Thank you so much. Everything you taught was put to right use and came in handy during my interviews! THANK YOU SO MUCH OMG ILY
Congrats on your new role! Very happy for you and I'm glad you found the videos useful. Thank you so much for the kind words. Good luck on the start of your new job!
that was a great explanation for a tough problem, Really could not think of all scenarios. I agree there isn't much good quality content of these series out there. Thanks for doing these
Man, thanks for these videos. I'm in an interview process and always get nervous or mix up in tech interviews. These videos are really helpful to structure the questions and the answers! Already subbed!
Thanks for watching and I'm glad you found this channel! I'll keep posting more videos for sure. Let me know if you have any topics you'd like me to cover. I always try to make videos that my audience wants.
Dude, seriously helpful stuff here to walk through your thought process. Very hard to translate practical experience into words, but it was extremely beneficial into breaking this down into digestible parts!
I find your videos very useful but in this case I think your solution is a little more complicated than needed. You could select the user if and their min purchase date and create a table with just what they bought the first time. You can then join on the min date and eliminate all the records of the first purchase date. You then do the same join but using product id. You are then left with a table that exclude for each user their first date purchase and any latter purchase that includes any of the products purchased the first date. The count of distinct user is gives you the result of how many users were reached by your campaign.
Very clear and nice explanation. I'm watching these out of pure curiosity to discover new ways I can approach problems. If problems/explanations on the platform are just as real-world oriented it might be worth to sign up 😏
Thanks for watching! Yes, those questions on the platform are real interview questions from companies. Many of them are straight out of interviews, while others are re-structured but still test for the same concepts. There are a few difficult ones I put on the platform to show you what a few real world questions would look like on the job. Hope they are helpful! There's 50 free questions on the platform so just do those until you want more=)
This one is pretty good. I needed to improve on understanding the logic of solving difficult SQL questions. Will have to practice more. Recap: 1. Identify all scenarios and edge cases. 2. Individually map out solution for each scenario. 3. Keep the logic separate from each other.
what about using CTEs... with fst as ( select user_id, min(created_at) firstDate from marketing_campaign group by user_id), firstOrders as ( select * from marketing_campaign a where exists (select 1 from fst b where b.user_id=a.user_id and b.firstDate=a.created_at)), newOrders as ( select * from marketing_campaign a where not exists (select 1 from firstOrders b where b.user_id=a.user_id and (b.created_at=a.created_at or b.product_id=a.product_id))) select count(distinct user_id) from newOrders
Probably one of the more difficult questions you'd see out there. The last part in handling one of the edge cases was definitely difficult so I hope I explained it right.
Easier solution select count(distinct m.user_id) from marketing_campaign m, (select user_id, min(created_at) from marketing_campaign group by user_id) a where a.user_id = m.user_id and m.product_id not in ( select product_id from marketing_campaign where created_at=a.min and user_id=a.user_id )
I think the reason this is confusing for me to read is that there’s a subquery in the from clause along with a table…..but I’m glad you shared this. It’s always helpful to see the different ways to solve the same problem, keeping in mind efficiency too. I’d also like to know how long it took people like you and Nate to be this good at SQL lol. I’m still new but sometimes it’s frustrating when I don’t learn everything as fast as I’d like to.
I have written a similar code but im getting 1 extra user id (24 instead of 23). Could anyone tell me what is wrong. select count(distinct(a.user_id)) from marketing_campaign a inner join (select user_id,created_at,b.product_id from marketing_campaign b group by 1,2,3 )b on a.user_id = b.user_id and a.created_at < b.created_at and a.product_id b.product_id ;
It's definitely not great phrasing on the problem. I may have reworded it a bit on the platform but it's probably still pretty difficult to really understand. This was a last rounder, in-person interview question that's meant to be comprehensive so I tried my best to word the problem as concisely as possible. Thanks for the feedback and thanks for watching!
QQ Nate: If we remove 1st subsql, then also code should work. What is the purpose to select user_id from the subsql? We should just find those combination of concat(userid,productid) that are bought on 1st day, and just filter out from main table. please suggest.
I believe you can do it that way. My code isn't the most efficient. A lot of that has to do with how I would explain the code to an interviewer. It's sometimes easier to write less efficient code but have the code follow logic step by step, especially if you're trying to walk someone through your solution. The last part of the interviews are usually talking about how to optimize the code, so that's where your version would work.
Nate this is a really helpful video, as are all of these where you work through a real problem. As someone wanting to become a data analyst, I’m wondering how in-depth my SQL interview questions are likely to be, vs those for a data science position. Any idea?
This is a question that is likely for a white boarding session that would take place on an on-site interview. I believe you could finish the problem in 30-min on a white board since it's more collaborative with the interviewer than coding on a text editor via Zoom. =)
With userFirstPurchase(user_id, first_purchase_dt,created_at, product_id) as (select user_id, min(created_at) over (partition by user_id), created_at, product_id from marketing_campaign ) select count(distinct user_id) from userFirstPurchase where created_at > first_purchase_dt and concat(user_id,product_id) not in (select concat(user_id,product_id) from userFirstPurchase where first_purchase_dt=created_at )
How about this? select count(distinct user_id) from marketing_campaign where concat(user_id,'_',product_id) not in( select usr_prd from (select concat(user_id,'_',product_id) usr_prd, created_at, rank() over(partition by user_id order by created_at asc) rnk from marketing_campaign ) up where rnk = 1 ); subquery creates a list of user_id+product_id for all first transactions. I am using rank window function to identify all first transactions. After that using "not in" clause to select all other combinations.
Will it work if instead of concatenating the user_id and product_id column, we just take product_id from the subquery and do product_id NOT IN subquery?
@@stratascratch I got your point about why we need to use CONCAT. If we don't create a new column, we will lose all the required rows as wel because they have the same ID's as well. Very nice question and an even better explanation.
Hi Nate, love your channel, and I have binge-watching your videos for the last 3 days. More power to you, for helping all of us. I came up with this solution using a self join: select COUNT(DISTINCT md1.user_id) from marketing_campaign md1 --day one JOIN marketing_campaign md2 --day two ON md1.user_id=md2.user_id --same user purchasing again AND md1.created_at::date
That's a nice solution. You're missing an edge case where a user cannot purchase the same items as the items in their first purchase. You have user 25 in your output but if you look at the raw data, user 25 purchase both item id 114, 115 during their 1st purchase. And then purchased the same items in the future. So 114 and 115 should not count and thus user 25 should not be counted. My code: concat((user_id),'_', (product_id)) not in (SELECT user_product FROM (SELECT *, rank() over(PARTITION BY user_id ORDER BY created_at) AS rn, concat((user_id),'_', (product_id)) AS user_product FROM marketing_campaign Handles this exact edge case.
Hey Nate, I don't understand why do we have to remove all the users with rank 1 in the second part of the query. In that, not all users buy the same product on a different date!
I'm actually only removing user-product combo for the user's 1st purchase because I don't care about their 1st transaction. If the user purchased the same items on a different date, then those transactions are also removed, which is what we want to do according to the question. What is kept are the products that are different than those the user purchased in a future transactions. This is definitely a difficult question to keep all cases in your head. What I'd do is follow a few of the users that I mentioned in the video. Head over to the platform and for each logic in the query, output the user and products purchased. Then see how specific rules filter out users. Hope that helps!
select count(distinct mc1.user_id) from marketing_campaign mc1 join marketing_campaign mc2 on mc1.user_id=mc2.user_id and mc2.created_at>=mc1.created_at+interval '1 day' and mc2.product_id!=mc1.product_id i am getting 24 and not 23? can anyone tell
My answer; SELECT COUNT(DISTINCT user_id) FROM( SELECT user_id ,created_at ,product_id ,last_order ,DATEDIFF(created_at,last_order) AS date_diff FROM( SELECT user_id ,created_at ,product_id ,LAG(created_at) OVER(PARTITION BY user_id ORDER BY created_at) last_order FROM( SELECT user_id ,created_at ,product_id ,ROW_NUMBER()OVER(PARTITION BY user_id,product_id ORDER BY created_at) AS row_num FROM marketing_campaign)a WHERE row_num = 1)b)c WHERE date_diff > 0
User 50 bought product 118 x4 at a price of 35 Then next day product 118 x4 at a price of 29 Then three weeks later product 118 x5 at a price of 299 Then next day product 118 x2 at a price of 199 Must have been GameStop stocks.
WITH CTE AS ( SELECT user_id, product_id, created_at, RANK() OVER (PARTITION BY user_id ORDER BY created_at) AS rn FROM marketing_campaign ), firstDayPurchases AS ( SELECT * FROM CTE WHERE rn = 1 ), subsequentPurchases AS ( SELECT * FROM CTE WHERE rn > 1 ) SELECT COUNT(DISTINCT user_id) FROM subsequentPurchases WHERE (user_id, product_id) NOT IN (SELECT user_id, product_id FROM firstDayPurchases)
Exists - ensures that the current row is not the first transaction for the user Not exists - ensure that the current product in the row is bought previously Select count(distinct user_id) from Market_Campaign MC Where exists (select 1 from Market_Campaign MCI where MCI.user_id = MC.user_id and MC.created_at > MCI.created_at) and not exists (select 1 from Market_Campaign MCI where MC.user_id = MC.user_id and MC.created_at > MCI.created_at and MC.product_id = MCI.product_id)
Nice explanation. But while testing it seems that it is missing one edge case. A user bought same product on Day 1 and Day 2 but he bought different products on Day3 and Day4. To cover all edge cases you can use below query: Here is dataset (check user_id 6): User_id Prod_id Created_at 1 11 1/1/2022 1 12 1/2/2022 2 13 4/1/2000 2 13 2/1/2022 3 14 3/1/2022 4 15 3/15/2022 4 16 3/15/2022 5 17 5/15/2022 5 18 5/16/2022 5 18 5/17/2022 SELECT COUNT (DISTINCT USER_ID) FROM USER3 WHERE USER_ID IN ( SELECT USER_ID FROM USER3 GROUP BY USER_ID HAVING COUNT (DISTINCT CREATED_AT) > 1 AND COUNT (DISTINCT PRODUCT_ID) > 1) AND USER_ID NOT IN ( SELECT USER_ID FROM (SELECT PRODUCT_ID, USER_ID, RANK () OVER (PARTITION BY USER_ID ORDER BY CREATED_AT) RN FROM USER3) A WHERE RN IN (1, 2) --rank 1 and 2 are the 1st and 2nd orders which should be different GROUP BY USER_ID HAVING COUNT (DISTINCT PRODUCT_ID) = 1 )
My Solution: with first_user_purchase as (select * from (select user_id, created_at, string_agg(distinct product_id::varchar, ';#') first_product, dense_rank() over (partition by user_id order by created_at::date) rank from marketing_campaign group by 1, 2)x where rank=1) select * from (select t1.user_id, t1.created_at, t1.product_id, t2.first_product from marketing_campaign t1 join first_user_purchase t2 on t1.user_id=t2.user_id and t1.created_att2.created_at)x where first_product not like '%'|| product_id ||'%';
Hi Nate, Don't know if you're gonna see this. But I religiously followed your youtube channel and even took a subscription at StrataScratch for an amazon interview for the role of data engineer. I can't believe that I cracked it, just got the mail that I got selected!!! Thank you so much. Everything you taught was put to right use and came in handy during my interviews! THANK YOU SO MUCH OMG ILY
Congrats on your new role! Very happy for you and I'm glad you found the videos useful. Thank you so much for the kind words. Good luck on the start of your new job!
that was a great explanation for a tough problem, Really could not think of all scenarios. I agree there isn't much good quality content of these series out there. Thanks for doing these
Thanks so much Priyanka! I'm glad I can add to the technical content out there on UA-cam.
Man, thanks for these videos. I'm in an interview process and always get nervous or mix up in tech interviews. These videos are really helpful to structure the questions and the answers! Already subbed!
Thanks for watching and I'm glad you found this channel! I'll keep posting more videos for sure. Let me know if you have any topics you'd like me to cover. I always try to make videos that my audience wants.
@@stratascratch keep going with your current content IMO. As you said in another video, there isn't much of this on YT! (or good quality at least)
@@LeviewFPV Then I'll keep doing what I do! Thanks for watching and for your input.
Dude, seriously helpful stuff here to walk through your thought process. Very hard to translate practical experience into words, but it was extremely beneficial into breaking this down into digestible parts!
Thanks so much for watching and for the kind words! Hope you stick around. Will be creating more and more videos!
excellent way of explanation👍👌
Thank you!
Thank you so much for this content . Can you start similar series for python interview for data science.
I love how you break the question down and explained it . Thank so for this video and for sharing the link to that question.
Glad it was helpful!
I find your videos very useful but in this case I think your solution is a little more complicated than needed. You could select the user if and their min purchase date and create a table with just what they bought the first time. You can then join on the min date and eliminate all the records of the first purchase date. You then do the same join but using product id. You are then left with a table that exclude for each user their first date purchase and any latter purchase that includes any of the products purchased the first date. The count of distinct user is gives you the result of how many users were reached by your campaign.
Since you can see multiple solutions for the same problem this is one of my favorite platforms to practice SQL skills.
Thanks for the kind words! It's one of the features everyone loves.
Very clear and nice explanation. I'm watching these out of pure curiosity to discover new ways I can approach problems. If problems/explanations on the platform are just as real-world oriented it might be worth to sign up 😏
Thanks for watching! Yes, those questions on the platform are real interview questions from companies. Many of them are straight out of interviews, while others are re-structured but still test for the same concepts. There are a few difficult ones I put on the platform to show you what a few real world questions would look like on the job. Hope they are helpful! There's 50 free questions on the platform so just do those until you want more=)
@@stratascratch are those 50 free ones important too?
@@akashchandra2223 Yes, they are. They are/were interview questions from companies.
These kinda questions will only improve the ability to solve them. Please keep posting advance SQL questions
Thanks for watching! Will keep doing these videos!
Plz make more videos with advanced SQL querys
Very useful. Thank you!
You're welcome. I am glad it helped you.
This one is pretty good. I needed to improve on understanding the logic of solving difficult SQL questions. Will have to practice more.
Recap:
1. Identify all scenarios and edge cases.
2. Individually map out solution for each scenario.
3. Keep the logic separate from each other.
I will always miss out on many products many days but same as the first purchase scenario. No matter how much i revise this same question and answer.
Thank you for explanation, this question is quite difficult.
Man this channel should be given a Nobel prize
=)
this is very helpful thanks
You're welcome. Visit our channel for more helpful videos.
I have learned a lot from you. I am trying to use your framework to in my following interviews.
Thanks great. Glad you watched the videos!
Thanks a lot Nate!
what about using CTEs...
with fst as (
select user_id, min(created_at) firstDate
from marketing_campaign
group by user_id),
firstOrders as (
select * from marketing_campaign a
where exists (select 1 from fst b where b.user_id=a.user_id and b.firstDate=a.created_at)),
newOrders as (
select * from marketing_campaign a
where not exists (select 1 from firstOrders b where b.user_id=a.user_id and
(b.created_at=a.created_at or b.product_id=a.product_id)))
select count(distinct user_id) from newOrders
put it in the platform and see if it validates! =)
the last scenario says multiple products on multiple days, shouldn't 46 not be in included in the marketing campaign as well ?
A difficult but inspiring question
Probably one of the more difficult questions you'd see out there. The last part in handling one of the edge cases was definitely difficult so I hope I explained it right.
Easier solution
select count(distinct m.user_id) from marketing_campaign m,
(select user_id, min(created_at) from marketing_campaign group by user_id) a where
a.user_id = m.user_id and
m.product_id not in (
select product_id from marketing_campaign where created_at=a.min and user_id=a.user_id )
Much more efficient. Thanks for watching the video.
Still trying to understand it but looks really good
I think the reason this is confusing for me to read is that there’s a subquery in the from clause along with a table…..but I’m glad you shared this. It’s always helpful to see the different ways to solve the same problem, keeping in mind efficiency too. I’d also like to know how long it took people like you and Nate to be this good at SQL lol. I’m still new but sometimes it’s frustrating when I don’t learn everything as fast as I’d like to.
I have written a similar code but im getting 1 extra user id (24 instead of 23). Could anyone tell me what is wrong.
select count(distinct(a.user_id)) from marketing_campaign a
inner join (select user_id,created_at,b.product_id from marketing_campaign b group by 1,2,3 )b on a.user_id = b.user_id and a.created_at < b.created_at and a.product_id b.product_id
;
The actual process for uncovering this is so elegant though-the initial word problem made me discouraged tbh
It's definitely not great phrasing on the problem. I may have reworded it a bit on the platform but it's probably still pretty difficult to really understand. This was a last rounder, in-person interview question that's meant to be comprehensive so I tried my best to word the problem as concisely as possible. Thanks for the feedback and thanks for watching!
Amazing Nate. Thanks for this
Thank you for watching the video!
QQ Nate: If we remove 1st subsql, then also code should work. What is the purpose to select user_id from the subsql? We should just find those combination of concat(userid,productid) that are bought on 1st day, and just filter out from main table. please suggest.
I believe you can do it that way. My code isn't the most efficient. A lot of that has to do with how I would explain the code to an interviewer. It's sometimes easier to write less efficient code but have the code follow logic step by step, especially if you're trying to walk someone through your solution. The last part of the interviews are usually talking about how to optimize the code, so that's where your version would work.
@@stratascratch thank you for your advise.
Nate this is a really helpful video, as are all of these where you work through a real problem. As someone wanting to become a data analyst, I’m wondering how in-depth my SQL interview questions are likely to be, vs those for a data science position. Any idea?
For SQL, there is no difference between a DA and DS. The DS position differs because you need to know much more modeling and statistics.
@@stratascratch thanks Nate!
After getting a job will stratscratch help me to become expert in database or this is just for interviews ?
Great video! But, I'm wondering whether there is a candidate who can finish this problem within 30 min during a Amazon tech interview.
This is a question that is likely for a white boarding session that would take place on an on-site interview. I believe you could finish the problem in 30-min on a white board since it's more collaborative with the interviewer than coding on a text editor via Zoom. =)
With userFirstPurchase(user_id, first_purchase_dt,created_at, product_id)
as
(select user_id, min(created_at) over (partition by user_id), created_at, product_id
from marketing_campaign )
select count(distinct user_id)
from userFirstPurchase
where created_at > first_purchase_dt
and concat(user_id,product_id) not in (select concat(user_id,product_id) from userFirstPurchase where first_purchase_dt=created_at )
How about this?
select count(distinct user_id) from marketing_campaign where concat(user_id,'_',product_id) not in(
select usr_prd from (select concat(user_id,'_',product_id) usr_prd, created_at, rank() over(partition by user_id order by created_at asc) rnk from marketing_campaign ) up where rnk = 1
);
subquery creates a list of user_id+product_id for all first transactions. I am using rank window function to identify all first transactions. After that using "not in" clause to select all other combinations.
Looks like it would work! Test it out on the platform!
@@stratascratch yeah i tested this, works fine :)
Will it work if instead of concatenating the user_id and product_id column, we just take product_id from the subquery and do product_id NOT IN subquery?
That sounds like a good approach. You can code it up on the platform and see if it produces the same output.
@@stratascratch Yes, I will try it and share the results with you in the comments.
@@stratascratch I got your point about why we need to use CONCAT. If we don't create a new column, we will lose all the required rows as wel because they have the same ID's as well. Very nice question and an even better explanation.
Hi Nate, love your channel, and I have binge-watching your videos for the last 3 days.
More power to you, for helping all of us.
I came up with this solution using a self join:
select
COUNT(DISTINCT md1.user_id)
from
marketing_campaign md1 --day one
JOIN
marketing_campaign md2 --day two
ON md1.user_id=md2.user_id --same user purchasing again
AND md1.created_at::date
That's a nice solution. You're missing an edge case where a user cannot purchase the same items as the items in their first purchase. You have user 25 in your output but if you look at the raw data, user 25 purchase both item id 114, 115 during their 1st purchase. And then purchased the same items in the future. So 114 and 115 should not count and thus user 25 should not be counted.
My code:
concat((user_id),'_', (product_id)) not in
(SELECT user_product
FROM
(SELECT *,
rank() over(PARTITION BY user_id
ORDER BY created_at) AS rn,
concat((user_id),'_', (product_id)) AS user_product
FROM marketing_campaign
Handles this exact edge case.
I still don't understand,why 46 is in the marketing campaign.. if on next date he bought the same product?
Hey Nate,
I don't understand why do we have to remove all the users with rank 1 in the second part of the query. In that, not all users buy the same product on a different date!
I'm actually only removing user-product combo for the user's 1st purchase because I don't care about their 1st transaction. If the user purchased the same items on a different date, then those transactions are also removed, which is what we want to do according to the question. What is kept are the products that are different than those the user purchased in a future transactions. This is definitely a difficult question to keep all cases in your head. What I'd do is follow a few of the users that I mentioned in the video. Head over to the platform and for each logic in the query, output the user and products purchased. Then see how specific rules filter out users. Hope that helps!
@@stratascratch Yeah, that helps now. I got confused for a while but I got it now. Thanks!!
Thanks a lot Sir
Thanks for watching!
select count(distinct mc1.user_id)
from marketing_campaign mc1
join marketing_campaign mc2 on
mc1.user_id=mc2.user_id and mc2.created_at>=mc1.created_at+interval '1 day'
and mc2.product_id!=mc1.product_id
i am getting 24 and not 23? can anyone tell
Thank You !!!
Thanks for watching. Please let me know if you have any topics you'd like me to cover!
@@stratascratch advanced SQL. basically, continue with these videos, please.
Do you need video editor? I'll do it for you
I'm okay for now. Thanks man!
My answer;
SELECT
COUNT(DISTINCT user_id)
FROM(
SELECT
user_id
,created_at
,product_id
,last_order
,DATEDIFF(created_at,last_order) AS date_diff
FROM(
SELECT
user_id
,created_at
,product_id
,LAG(created_at) OVER(PARTITION BY user_id ORDER BY created_at) last_order
FROM(
SELECT
user_id
,created_at
,product_id
,ROW_NUMBER()OVER(PARTITION BY user_id,product_id ORDER BY created_at) AS row_num
FROM marketing_campaign)a
WHERE row_num = 1)b)c
WHERE date_diff > 0
User 50 bought
product 118 x4 at a price of 35
Then next day
product 118 x4 at a price of 29
Then three weeks later
product 118 x5 at a price of 299
Then next day
product 118 x2 at a price of 199
Must have been GameStop stocks.
Hi Nate!
Hope you are doing fine.
Please check your email.
WITH CTE AS (
SELECT
user_id,
product_id,
created_at,
RANK() OVER (PARTITION BY user_id ORDER BY created_at) AS rn
FROM
marketing_campaign
),
firstDayPurchases AS ( SELECT * FROM CTE WHERE rn = 1 ),
subsequentPurchases AS ( SELECT * FROM CTE WHERE rn > 1 )
SELECT
COUNT(DISTINCT user_id)
FROM
subsequentPurchases
WHERE
(user_id, product_id) NOT IN (SELECT user_id, product_id FROM firstDayPurchases)
Hello. Not "hey guys".
Hello
Exists - ensures that the current row is not the first transaction for the user
Not exists - ensure that the current product in the row is bought previously
Select count(distinct user_id) from Market_Campaign MC
Where exists (select 1 from Market_Campaign MCI where MCI.user_id = MC.user_id and MC.created_at > MCI.created_at)
and not exists (select 1 from Market_Campaign MCI where MC.user_id = MC.user_id and MC.created_at > MCI.created_at and MC.product_id = MCI.product_id)
Nice explanation. But while testing it seems that it is missing one edge case. A user bought same product on Day 1 and Day 2 but he bought different products on Day3 and Day4. To cover all edge cases you can use below query:
Here is dataset (check user_id 6):
User_id Prod_id Created_at
1 11 1/1/2022
1 12 1/2/2022
2 13 4/1/2000
2 13 2/1/2022
3 14 3/1/2022
4 15 3/15/2022
4 16 3/15/2022
5 17 5/15/2022
5 18 5/16/2022
5 18 5/17/2022
SELECT COUNT (DISTINCT USER_ID)
FROM USER3
WHERE USER_ID IN
( SELECT USER_ID
FROM USER3
GROUP BY USER_ID
HAVING COUNT (DISTINCT CREATED_AT) > 1 AND COUNT (DISTINCT PRODUCT_ID) > 1)
AND
USER_ID NOT IN
(
SELECT USER_ID
FROM (SELECT PRODUCT_ID,
USER_ID,
RANK () OVER (PARTITION BY USER_ID ORDER BY CREATED_AT) RN
FROM USER3) A
WHERE RN IN (1, 2) --rank 1 and 2 are the 1st and 2nd orders which should be different
GROUP BY USER_ID
HAVING COUNT (DISTINCT PRODUCT_ID) = 1
)
My Solution:
with first_user_purchase as
(select *
from
(select user_id,
created_at,
string_agg(distinct product_id::varchar, ';#') first_product,
dense_rank() over (partition by user_id order by created_at::date) rank
from marketing_campaign
group by 1, 2)x
where rank=1)
select *
from
(select t1.user_id,
t1.created_at,
t1.product_id,
t2.first_product
from marketing_campaign t1
join first_user_purchase t2 on t1.user_id=t2.user_id
and t1.created_att2.created_at)x
where first_product not like '%'|| product_id ||'%';