We could have also done it using Lead-Lag functions, these would make it little more easier!
But if some one login on 31sth of the month then next month 1st day then day logic will be 31 and 01. These two days will not come under same group
with cte as
(select *,
row_number() over() as pw
from pwc_attandance_log),
cte1 as (select emp_id,log_date,flag,pw,
pw - (row_number() over() ) as ps
from cte
where flag = 'Y'),
cte2 as (select *, count(ps) over(partition by ps) as vv
from cte1 )
select emp_id,min(log_date),max(log_date),vv as days
from cte2
where vv >= 2
group by vv,emp_id
order by emp_id
we can do this with one cte
with cte as (
select emp_id,log_date,flag,date-rn as group_val from (
select *,ROW_NUMBER() over (partition by emp_id order by emp_id) as rn
,DATEPART(MONTH,log_date) as date
from pwc_attandance_log where flag = 'Y'
) x
)
select emp_id,MIN(log_date) as start_date, MAX(log_date) as end_date ,COUNT(1) as con_date
from cte group by emp_id,group_val having count(1) > 1
with cte as
(select *, row_number() over(partition by emp_id order by log_date) as r,
DAY(log_date) AS "DAY", cast(DAY(log_date) as int) - row_number() over(partition by emp_id order by log_date) as diff
from pwc_attandance_log
where flag='Y')
select emp_id,min(log_date) as start_date, max(log_date) as end_date,count(diff)
from cte
group by emp_id,diff
having count(diff) >=2
order by emp_id;
can anyone provide solution using lag functionality
The Solution seems to have one issue when the dates are from different month .
My solution :
WITH CTE AS (select *,LAG(log_date) OVER(partition by emp_id order by log_date) as previous_log_date from pwc_attandance_log where flag='Y' )
,PQR AS (select *,CASE WHEN previous_log_date-LAG(previous_log_date) OVER(partition by emp_id order by previous_log_date)>1 then 1 else 0 END as group_no from CTE where DATE_PART('day',log_date)- DATE_PART('day',previous_log_date)=1 )
,XYZ AS (select *,SUM(group_no) OVER(partition by emp_id order by emp_id,previous_log_date) as g1 from PQR)
,last_1 as (select emp_id,g1,count(*)+1 as cnt ,min(previous_log_date) as min_date ,max(log_date) as max_date from XYZ group by emp_id,g1)
select emp_id,cnt as consecutive_days,min_date as start_date,max_date as end_date from last_1;
If a employee attend 1-05-2024, then again login 1-06-2024. Then what will happen?
using one cte
with cte as(
Select * ,day(log_date) - row_number()over(partition by emp_id order by log_date) as diff_ ,
row_number()over(partition by emp_id order by log_date) as rn,
day(log_date) as day_
from pwc_attandance_log
where flag='Y')
select emp_id,diff_, min(log_date) as start_date ,max(log_date) as end_date,count(*) as consucatie_date
from cte group by emp_id, diff_
having count(*)>1
What if the date contains value like 31JAN and 01FEB as consecutive days, in that case it will give wrong group_ value.
I think we can go for "day-of-year" in that case & then we can do the grouping.
Yes, instead of taking month from date we should take difference between current date and date column
This solution only works for a specific month. This will fail if the data is scaled over several months.
Yes correct,
To cover that scenario take the difference between the date and row number to make the group that covers all the cases...
I have covered this approach in my videos.
Thanks for the video.
I have one doubt . Lets say we have log_date as 02-01-2024 and the next date is 03--02-2024.
If we take day part out of them we will still get 2 and 3
Thanks if the data Is like that then logic remains the same we just need to apply the group clause on the year and month also then to make it unique.
Awesome solution😊