I think this is one of the most required explanations for KMP. It is missing almost everywhere else and the algorithm is being blindly followed incase a mismatch occurs. Thanks a LOT.
@Akhil Raj the time complexity is 0(m+n) as, we are traversing the whole array once ,the the given pattern and the constructed array. I think now you have understood .
@@lavishgarg4274 But if there are lot of backtraversals in bigger array(worst case scenario), Will it not be: O(m+x). x being number of steps taken back.
@@SearchingPeaceInside Hello, The worst case scenario is which it has to back travel in every case, so, in either step we either increase our iterator, or we slide our pattern to be matched. Worst case, you'll slide the pattern once for each iteration, hence this will be O(2*N) which is basically O(N) solution. I hope I'm correct :)
Even Abdul Bari didn't explain it in so much detail. Best explanation on the planet!!!! Thank you so much I thought I'll never understand this algorithm but you've made it so easy. Thanks again!
I've spent more than a week reading articles and watching tutorials to figure out how the hell these values came from but here it is explained in just 9 minutes. Clearest explanation seen so far!!
He's best. I have seen several other videos but none of them has explained why we have to look at previous index and then find out the longest suffix which is also prefix. Reason is : Let's say we have come across a mismatch at some "j" & "i" , j < i : 1. character at index "j" is not matching with character at index "i". 2. We know that the prefix (0, j-1) matches with (i-j, i-1). 3. Now we have to look at what is the longest suffix which is also prefix for the substring (0, j-1). And this will also be the longest suffix for string ending at index "i-1". This is the trick. Thank you Tushar!! You nailed it!! More power to you!!
Thanks for both those videos. YT recommended me this video. I came in only hearing about KMP and 20 minutes later came out with a good grasp of the algorithm. Great work :)
the interesting thing to notice is that the logic is very similar for when we are creating the preprocessed array and when we are matching the pattern.
Excellent explanation! The basic idea is to use the already calculated prefix array (or say Partially Match Table on somewhere) when you are calculating/building the prefix array, if I understand that correctly. Thank you very much!
FOR LAST MISTMATCH : The first 11 are same with last 11 i.e 0-10 = 6-16 now if 11 and 17 were same we could say 0-11 = 6 -17 but that's not the case so we see how many char are same in first 0-10 which is lps[10] = 5 so 0-4 = 6-10 or 0-4 = 12 -16 as 6-10 = 12 -16 so if 5 and 17 were equal we could say 0-5 = 12-17 but again not the case so how many are equal in first 0-4 i.e = 3 so 0-2 = 14-16 and 3 = 17 so 0-3 = 14-17 i.e. 4 length ps: Thank you Tushar for a rly good explanation!
Hello , May you change it to C++ code ? I mean the thing that you wrote for KMP substring search from a file . It is my DSA group project but i tried too much could not . if u can . thank you so much .
Our group project is about searching sub string from a file using KMP algorithm in c++ . I tried a lot to convert your coding from java to c++ but could not . May be i need more time to do it . But i have a lot of things to do from other subjects . If you can plz help us . again tnqs a lot .
Good stuff! At the very end, after you set tmp[17] = 4; do you increment 'j' since the values 'c' match? In other words, is j==4 at the very end? This becomes important if we have a longer input.
Key logic - What happens on a mismatch ? What we are really doing when we reset j = A[j-1] is considering prefixes and suffixes of the longest prefix that we found on the first mismatch ... this way we are first considering the longest suffix( thats a prefix ) and then the 2nd longest , and then the 3rd longest and soon until j=0.
Hi malhar. I am still not quite sure here. Why we can safely reset j = A[j - 1]? 'j' here points to the end of longest prefix (it's also suffix) of s[0:i] (substring we have seen so far). For example, in the last part of video. When we move i from 16 to 17 (j from 10 to 11) and find s[j] != s[i], why we can safely jump from j = 10 to j = A[j - 1] = 5 instead of j = 10 to j = 9? How can we know that j = 6, 7, 8, 9 are all impossible as the candidate prefix length and safely discard them? Could you explain this part? Thanks very much!
+yusong shen - Ok ,Let me clarify - 1) I feel your concept of "safety" is slightly incorrect , please correct me if I'm wrong. On seeing a mismatch ,"Safety" should mean that we should not skip past any length of the pattern and should start at j=0. 2) According to your question , at j=10 , if you are going to j = 5 , then dont you think that should be "safer" than going to j=9 ? 3) "Why is it safe to skip past j=0,1,2,3,4" is what I feel you should have asked. In short - I feel going backward in the pattern ( and then checking for matching ) should be safer than going forward ( and then checking for matching )
To explain why his method is correct - 1) When you get a mismatch , if j > 0 , then that means before the mismatch , you matched upto a lenght of pattern "j-1" ( if j > 0 ) I will refer to j and i as "mismtach index" for now. ... where s[i] ! = s[j]. 2) The suffix of that pattern ( from 0 to j ) is also the prefix of the pattern just before the mismatch index i. ( Just see the last bit of the video - I am referrring to "acaca" from index 0 to 4, and 12 to 16. 3) Hence on seeing the mismatch , from 1) above , you know that you have "safely" matched from index 0 to 4 ( inclusive of 0 and 4 ) in the pattern. This is the info given by preprocessed array. why he keeps resetting j = A[j-1] is slightly trickier but intuitively ( and can be understood by drawing it out ) by what I explained in the ansewer above.
I wonder if the last index value needs to be calculated, because we always only need the previous index value. Such as in this example, there is no need to figure out the value 4 for index 17.
amazing video. Question: if we do such jumps to find candidate J for each i... then how are we saying this Prefix construction is O(n). for each i isnt there a possibility that we might do j jumps (till we reach j=0) and hence it is O(n^2)
I think this is one of the most required explanations for KMP. It is missing almost everywhere else and the algorithm is being blindly followed incase a mismatch occurs. Thanks a LOT.
Pls sir lyrics show in down side b/c problems solution not seeing properly thanking you
@Akhil Raj the time complexity is 0(m+n) as, we are traversing the whole array once ,the the given pattern and the constructed array.
I think now you have understood .
@@lavishgarg4274 But if there are lot of backtraversals in bigger array(worst case scenario), Will it not be: O(m+x). x being number of steps taken back.
@@SearchingPeaceInside Hello, The worst case scenario is which it has to back travel in every case, so, in either step we either increase our iterator, or we slide our pattern to be matched. Worst case, you'll slide the pattern once for each iteration, hence this will be O(2*N) which is basically O(N) solution. I hope I'm correct :)
Even Abdul Bari didn't explain it in so much detail. Best explanation on the planet!!!! Thank you so much I thought I'll never understand this algorithm but you've made it so easy. Thanks again!
Exactly..this guuuy explained this topic easily without confusion
Tried like 15 tutorials before this one ! and my search ends :) superb keep it up
you became tired i think
I've spent more than a week reading articles and watching tutorials to figure out how the hell these values came from but here it is explained in just 9 minutes. Clearest explanation seen so far!!
same here bro...
He's best. I have seen several other videos but none of them has explained why we have to look at previous index and then find out the longest suffix which is also prefix.
Reason is :
Let's say we have come across a mismatch at some "j" & "i" , j < i :
1. character at index "j" is not matching with character at index "i".
2. We know that the prefix (0, j-1) matches with (i-j, i-1).
3. Now we have to look at what is the longest suffix which is also prefix for the substring (0, j-1). And this will also be the longest suffix for string ending at index "i-1". This is the trick.
Thank you Tushar!! You nailed it!! More power to you!!
Way better than other complicated explanations.
Please make a series on tough interview questions asked in Google, Amazon etc.
I rarely comment.
"This is by far the best explaination on KMP algorithm I found on UA-cam. And I've watched lots of other videos."
Thanks for both those videos. YT recommended me this video. I came in only hearing about KMP and 20 minutes later came out with a good grasp of the algorithm. Great work :)
Once lived a legend, Tushar Roy
Finally with this video i understood the whole logic behind building LPS[ ] array. Very Clear and Concise explanation.
Thanks.
the interesting thing to notice is that the logic is very similar for when we are creating the preprocessed array and when we are matching the pattern.
True
Exactly 😌
i was totally upset in the part 1 now i am fully cleared....thank you tushar....
A great example is always better than a lot of talking information. Thank you!
Thanks! This was a great video. Computing the failure function is the trickiest part of the KMP algorithm!
so lucky to find you...simple and clear way Tushar Sir.
Pretty much useful for everything, Competitive Programming and Interviews. :)
Keep the work going on.
Man, you gotta realize that how much you help me, thanks you very much , you're my god!!
It's a awesome exaplanation of KMP and i really thank's a lot.
Thank Tushar sir. You are doing amazing job for us. I go through the many tutorials but didn't understand KMP. Now its completely clear. Thanks a lot.
Way better than any other explanation article or tutorial on the internet (definitely true for a rookie like me).
Thanks Tushar for a clear and concise video explanation
lol it took me like an entire day to try and figure this out ... thanks a lot for the explanation !
Well done! Gautam Gambir
Finally someone talks about what the number stands for...thank you!
Normally, I don't comment. But you are just so amazing. Plz keep uploading such kind of videos. Thank you so much
Finding the value of pi for last index clears all d doubts!!!!!!
themks a lawt!!!
surely One of the best videos on KMP algo.
keep it up..
please make video on DP bitmasks
This is an explanation that could have only been done if you have deep understanding of KMP. Amazing explanation.
Extremely edifying. Keep up the good work.
I am wondering why no any new videos are coming up now?
Great channel.
Awesome explanation, Loved it!!!
Excellent explanation! The basic idea is to use the already calculated prefix array (or say Partially Match Table on somewhere) when you are calculating/building the prefix array, if I understand that correctly. Thank you very much!
Best Explanation finally i Got the solution
Great explanation Tushar Roy. Keep uploading awesome tutorials like this one. :)
TY for all your videos and my BSC =)
Sir a BIG THANKS FROM THAPAR UNIVERSITY, PATIALA. YOU DESERVE A GOLD MEDAL. PLEASE BECOME A PROFESSOR HERE.
Degree hogi poori Teri kaka?
You are really helping me study ! Thank you for this.
Sir g kia samhjaya hai
Me apka bahut bara fan hogya hun
The best video I saw about KMP array. It's better if you can give more details about why we handle the case in this way when i and j doesn't match.
nicely explained, i tried it from coreman earlier, but now its more clear to me ... thanks dear :)
thanks Tushar you explained it well.
I love so much, Tushar.
This is great! Helped me a lot. Thanks
your videos are awsome !
helped me such in graphs and also kmp
finally I understood why we go to j-1 after looping through 6:08 to 8:30
What you understood?
Beautifully explained. Thank you so much :)
Helped me a lot. Thank you!
Thank you so much! It helped me a lot.
This one is really helpful to understand why j keeps backtracking to j - 1 position! Thanks a ton!
It's not called backtracking
thank u very much ...great explanation with good examples.
nice video..it clears my concepts...thanks
sir i think last one will 3 instead of 4. sir please explain if i m wrong
Very nicely explained.Thanks sir
your published this video on my birthday...14 june ..
Finally, My search ends after watching more than 10 videos...
Very helpful video. Thank you!
bahut bdiya roy
bhai
best explanation ever.
meh could be better
thank you sir..ur tutorials helps me lot>>>>...
Nice work Tushar!
Sir again awesome video.... sir when you are going to make videos on graph ? eagerly waiting to watch graph videos....
excellent expalanation sir
Amazing sir 👏
beautifully explained, thank you sir
Thank you Tushar !
Thank you! Great job :)
FOR LAST MISTMATCH :
The first 11 are same with last 11 i.e
0-10 = 6-16 now if 11 and 17 were same we could say 0-11 = 6 -17 but that's not the case
so we see how many char are same in first 0-10 which is lps[10] = 5
so 0-4 = 6-10 or 0-4 = 12 -16 as 6-10 = 12 -16
so if 5 and 17 were equal we could say 0-5 = 12-17 but again not the case
so how many are equal in first 0-4 i.e = 3
so 0-2 = 14-16 and 3 = 17 so 0-3 = 14-17 i.e. 4 length
ps: Thank you Tushar for a rly good explanation!
Excellent explanation!
Fabulous explaination !!!
and thank you very much for uploading this useful video... :)
Thanks a lot . very very excellent and beneficial .
Hello , May you change it to C++ code ? I mean the thing that you wrote for KMP substring search from a file . It is my DSA group project but i tried too much could not . if u can . thank you so much .
Our group project is about searching sub string from a file using KMP algorithm in c++ .
I tried a lot to convert your coding from java to c++ but could not . May be i need more time to do it . But i have a lot of things to do from other subjects . If you can plz help us . again tnqs a lot .
ok thanks a lot .
this video just helped me...thnx
Awesome tushar...
Good stuff! At the very end, after you set tmp[17] = 4; do you increment 'j' since the values 'c' match? In other words, is j==4 at the very end? This becomes important if we have a longer input.
very nice explanation
really helped me a lot .... thanks bro!!!
Thanks for the nice video man!
waw you give so amuesing tutorial thanks so much!!!!!
You are the best.. Thanks (y)
thank you so much clarifying sir
Key logic - What happens on a mismatch ?
What we are really doing when we reset j = A[j-1] is considering prefixes and suffixes of the longest prefix that we found on the first mismatch ... this way we are first considering the longest suffix( thats a prefix ) and then the 2nd longest , and then the 3rd longest and soon until j=0.
Hi malhar. I am still not quite sure here. Why we can safely reset j = A[j - 1]? 'j' here points to the end of longest prefix (it's also suffix) of s[0:i] (substring we have seen so far). For example, in the last part of video. When we move i from 16 to 17 (j from 10 to 11) and find s[j] != s[i], why we can safely jump from j = 10 to j = A[j - 1] = 5 instead of j = 10 to j = 9? How can we know that j = 6, 7, 8, 9 are all impossible as the candidate prefix length and safely discard them?
Could you explain this part? Thanks very much!
+yusong shen -
Ok ,Let me clarify -
1) I feel your concept of "safety" is slightly incorrect , please correct me if I'm wrong.
On seeing a mismatch ,"Safety" should mean that we should not skip past any length of the pattern
and should start at j=0.
2) According to your question , at j=10 , if you are going to j = 5 , then dont you think that should be "safer" than going to j=9 ?
3) "Why is it safe to skip past j=0,1,2,3,4" is what I feel you should have asked.
In short - I feel going backward in the pattern ( and then checking for matching ) should be safer than going forward ( and then checking for matching )
To explain why his method is correct -
1) When you get a mismatch , if j > 0 , then that means before the mismatch , you matched upto a lenght of pattern "j-1" ( if j > 0 )
I will refer to j and i as "mismtach index" for now. ... where s[i] ! = s[j].
2) The suffix of that pattern ( from 0 to j ) is also the prefix of the pattern just before the mismatch index i.
( Just see the last bit of the video - I am referrring to "acaca" from index 0 to 4, and 12 to 16.
3) Hence on seeing the mismatch , from 1) above , you know that you have "safely" matched from index 0 to 4 ( inclusive of 0 and 4 ) in the pattern. This is the info given by preprocessed array.
why he keeps resetting j = A[j-1] is slightly trickier but intuitively ( and can be understood by drawing it out ) by what I explained in the ansewer above.
How do we know that the characters from j=0 to j=4 will match those from j=12 to j=16?
I wonder if the last index value needs to be calculated, because we always only need the previous index value. Such as in this example, there is no need to figure out the value 4 for index 17.
Great Explanantion really
Thank you very much.謝謝你。
Nice explanation...
as always.. great explaination!!
Police sirens in the background at 3:11 ?? :P Great video btw....thnx
Sreerag Nair Algorithms are illegal, you know
Kote Kutalia hahaha.... good one
you are superb sir
Thank you for explaining
Great Explanation.
+Tushar Roy they are used for polynomial evaluations and related stuff
amazing video.
Question:
if we do such jumps to find candidate J for each i...
then how are we saying this Prefix construction is O(n).
for each i isnt there a possibility that we might do j jumps (till we reach j=0)
and hence it is O(n^2)
I also has the same question. Can someone explain this please?
Is this algorithm is fastest algo to search string pattern...?
3:10 Cops are arriving to learn KMP too.
Thanks A Lot Sir!!
can you do the version where you use automata? can't find a decent explanation to that ( how to build it mainly)
You did use a eraser, thanks man
awesome video Tushar Sir.If u could please give a video on fast fourier transforms and inverse of fft
Thank you so much Sir ...