For the SQL question, an easier approach would be like this: with cte as( select machine_id, process_id, max(timestamp)-min(timestamp) as time from table group by 1, 2 ) select machine id, average(time) avg_time from cte group by 1
sql question approach- WITH a AS ( SELECT machine_id, process_id, activity_type, timestamp,ROW_NUMBER() OVER (partition by machine_id,process_id ORDER BY machine_id, process_id) AS rn FROM machine), b as (select a1.machine_id,a1.process_id,a1.activity_type as activity_type1,a1.timestamp as timestamp1, a2.activity_type as activity_type2,a2.timestamp as timestamp2,(a2.timestamp-a1.timestamp) as timestamp_diff from a a1 inner join a a2 on a1.machine_id=a2.machine_id and a1.process_id=a2.process_id and a2.rn=a1.rn+1) select round(avg(timestamp_diff),3) as avf,machine_id from b group by machine_id order by machine_id;
two approaches for the SQL question which came to my mind .. SELF JOIN ========== SELECT a1.machine_id as machine_id, round(sum(a2.timestamp - a1.timestamp)/count(distinct(a1.process_id)),3) as processing_time FROM Activity a1 INNER JOIN Activity a2 ON a1.machine_id = a2.machine_id AND a1.process_id = a2.process_id WHERE a1.activity_type = 'start' AND a2.activity_type = 'end' GROUP BY a1.machine_id ; CASE STATEMENT ================= SELECT machine_id, round(sum(case when activity_type = 'start' then -timestamp else timestamp end )/count(distinct(process_id)) , 3)as processing_time from Activity group by machine_id ;
for last few days I am having a lot of health issues. just recovering, I understand that I could not release it on time. I will bring these as soon as I feel a bit better.
My Approach python Coding Question - s = "stringity" i = 0 res = "" for str in range(len(s)): if s[str] == 'i': res += s[i:str][::-1] i = str+1 print(res+s[i:])
For the SQL question, an easier approach would be like this:
with cte as(
select machine_id, process_id, max(timestamp)-min(timestamp) as time
from table
group by 1, 2
)
select machine id, average(time) avg_time
from cte
group by 1
Reverse string on every occurrence of I letter
Avg of logged time
sql question approach-
WITH a AS ( SELECT machine_id, process_id, activity_type, timestamp,ROW_NUMBER() OVER (partition by machine_id,process_id ORDER BY machine_id, process_id) AS rn FROM machine),
b as (select a1.machine_id,a1.process_id,a1.activity_type as activity_type1,a1.timestamp as timestamp1,
a2.activity_type as activity_type2,a2.timestamp as timestamp2,(a2.timestamp-a1.timestamp) as timestamp_diff
from a a1 inner join a a2 on a1.machine_id=a2.machine_id and a1.process_id=a2.process_id and a2.rn=a1.rn+1)
select round(avg(timestamp_diff),3) as avf,machine_id from b group by machine_id order by machine_id;
two approaches for the SQL question which came to my mind ..
SELF JOIN
==========
SELECT
a1.machine_id as machine_id,
round(sum(a2.timestamp - a1.timestamp)/count(distinct(a1.process_id)),3) as processing_time
FROM Activity a1 INNER JOIN Activity a2
ON a1.machine_id = a2.machine_id AND a1.process_id = a2.process_id
WHERE a1.activity_type = 'start' AND a2.activity_type = 'end'
GROUP BY a1.machine_id ;
CASE STATEMENT
=================
SELECT
machine_id,
round(sum(case when activity_type = 'start' then -timestamp else timestamp end )/count(distinct(process_id)) , 3)as processing_time
from Activity
group by machine_id ;
Nice discussion, but we can't say it's data engineer mock interview.
First view , first like, first comment. 🎉
Sumit sir please upload the python video it’s been 10 days plz 🙏
for last few days I am having a lot of health issues. just recovering, I understand that I could not release it on time. I will bring these as soon as I feel a bit better.
@@sumitmittal07 ohh sorry sir couldn’t know this please tc first. Hope you have a speedy recovery 🙏
Yes sir please upload python videos we r waiting from long
Will upload soon
My Approach python Coding Question -
s = "stringity"
i = 0
res = ""
for str in range(len(s)):
if s[str] == 'i':
res += s[i:str][::-1]
i = str+1
print(res+s[i:])