![]() |
| |||||||
| Home | Register | Blogs | FAQ | Members List | Calendar | Downloads | Arcade | Mark Forums Read |
| Algorithms The source of algorithms for your project |
![]() |
| | LinkBack | Thread Tools | Display Modes |
| | #1 |
| Administrator Join Date: Sep 2001 Location: Yerevan, Armenia
Posts: 6,995
Blog Entries: 14 Rep Power: 10 Reputation:
183 | Knuth-Morris-Pratt algorithm Main features
The design of the Knuth-Morris-Pratt algorithm follows a tight analysis of the Morris and Pratt algorithm. Let us look more closely at the Morris-Pratt algorithm. It is possible to improve the length of the shifts. Consider an attempt at a left position j, that is when the the window is positioned on the text factor y[j .. j+m-1]. Assume that the first mismatch occurs between x[i] and y[i+j] with 0 < i < m. Then, x[0 .. i-1] = y[j .. i+j-1] =u and a = x[i] y[i+j]=b.When shifting, it is reasonable to expect that a prefix v of the pattern matches some suffix of the portion u of the text. Moreover, if we want to avoid another immediate mismatch, the character following the prefix v in the pattern must be different from a. The longest such prefix v is called the tagged border of u (it occurs at both ends of u followed by different characters in x). This introduces the notation: let kmpNext[i] be the length of the longest border of x[0 .. i-1] followed by a character c different from x[i] and -1 if no such tagged border exits, for 0 < i m. Then, after a shift, the comparisons can resume between characters x[kmpNext[i]] and y[i+j] without missing any occurrence of x in y, and avoiding a backtrack on the text (see figure 7.1). The value of kmpNext[0] is set to -1.![]() Figure 7.1: Shift in the Knuth-Morris-Pratt algorithm (v border of u and c b).The searching phase can be performed in O(m+n) time. The Knuth-Morris-Pratt algorithm performs at most 2n-1 text character comparisons during the searching phase. The delay (maximal number of comparisons for a single text character) is bounded by log (m) where is the golden ratio ( ).The C code void preKmp(char *x, int m, int kmpNext[]) { int i, j; i = 0; j = kmpNext[0] = -1; while (i < m) { while (j > -1 && x[i] != x[j]) j = kmpNext[j]; i++; j++; if (x[i] == x[j]) kmpNext[i] = kmpNext[j]; else kmpNext[i] = j; } } void KMP(char *x, int m, char *y, int n) { int i, j, kmpNext[XSIZE]; /* Preprocessing */ preKmp(x, m, kmpNext); /* Searching */ i = j = 0; while (j < n) { while (i > -1 && x[i] != y[j]) i = kmpNext[i]; i++; j++; if (i >= m) { OUTPUT(j - i); i = kmpNext[i]; } } } The example Preprocessing phase ![]() The kmpNext table First attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 2 3 4 G C A G A G A G Shift by: 4 (i-kmpNext[i]=3- -1) Second attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G Shift by: 1 (i-kmpNext[i]=0- -1) Third attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 2 3 4 5 6 7 8 G C A G A G A G Shift by: 7 (i-kmpNext[i]=8-1) Fourth attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G Shift by: 1 (i-kmpNext[i]=1-0) Fifth attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G Shift by: 1 (i-kmpNext[i]=0- -1) Sixth attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G Shift by: 1 (i-kmpNext[i]=0- -1) Seventh attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G Shift by: 1 (i-kmpNext[i]=0- -1) Eighth attempt G C A T C G C A G A G A G T A T A C A G T A C G 1 G C A G A G A G Shift by: 1 (i-kmpNext[i]=0- -1) The Knuth-Morris-Pratt algorithm performs 18 character comparisons on the example. http://www-igm.univ-mlv.fr/~lecroq/string/examples/exp8.html |
| | |
![]() |
| Thread Tools | |
| Display Modes | |
| |
Similar Threads | ||||
| Thread | Thread Starter | Forum | Replies | Last Post |
| plss help...booth's multiplication algorithm | ayaksah | Algorithms | 1 | Sep 10, 2005 11:22 |
| Wired Equivalent Privacy (WEP) algorithm (part of the 802.11) | greka | Software Security | 0 | May 29, 2004 14:06 |