Friday, March 4, 2011

KMP -- String Search Algorithm --

Today I'm going to write about the KMP algorithm, a string search algorithm named after the three researchers who designed it: Knuth, Morris, and Pratt.

There are several string search algorithms out there. Besides KMP, I know the BM (Boyer-Moore) and KR (Karp-Rabin) methods.
But KMP seems to be the most common of them, and I see it used more often than the others.

The idea is quite simple.
The Wikipedia page on the algorithm explains it quite well.

But it's not smart to implement the algorithm exactly the way it's described on that page. Having a good grasp of the algorithm but not being able to implement it well is a case of so near and yet so far.
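As a point of reference for what any KMP implementation has to compute, here is a minimal brute-force sketch in C of the table the algorithm relies on: for each prefix of the word, the length of the longest proper prefix that is also a suffix. The helper name border_length is hypothetical and does not come from my solution. It only illustrates the definition and is far too slow for real use.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper for illustration only: returns the length of the
   longest proper prefix of word[0..len-1] that is also a suffix of it,
   found by trying every candidate length from longest to shortest. */
int border_length(const char *word, int len) {
    assert(word != NULL && len >= 0);
    for (int k = len - 1; k > 0; k--) {
        /* Compare the first k characters with the last k characters. */
        if (memcmp(word, word + len - k, (size_t)k) == 0)
            return k;
    }
    return 0; /* no non-empty proper prefix is also a suffix */
}
```

For the word "abab", the prefixes of length 1, 2, 3 and 4 give 0, 0, 1 and 2. KMP computes the same values in linear time instead of re-comparing from scratch for each prefix.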

There's a simpler implementation that many competitive programmers use.
I was thrilled when I first saw it. So simple and so smart!!
Here's my solution to a POJ problem using that implementation:



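#include <stdio.h>
#include <string.h>

char text[1000000+1];  // text to search in
char word[10000+1];    // word (pattern) to search for
int fail[10000+1];     // failure table for word, indices 0..strlen(word)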

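// Build the failure table: for i >= 1, fail[i] is the length of the longest
// proper prefix of word[0..i-1] that is also its suffix. fail[0] = -1 is a sentinel.
void mkFail() {
    int n = strlen(word);
    int j = fail[0] = -1;

    for (int i = 1; i <= n; i++) {
        // Shrink the candidate prefix until it can be extended by word[i-1].
        while (j >= 0 && word[j] != word[i-1])
            j = fail[j];
        fail[i] = ++j;
    }
}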

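// Count the occurrences of word in text, overlapping ones included.
int kmp() {
    int n = strlen(word);
    int l = strlen(text);
    int cnt = 0;

    // i = number of characters of word matched so far, m = position in text.
    for (int i = 0, m = 0; m < l; m++) {
        // On a mismatch, fall back to the next shorter prefix that still matches.
        while (i >= 0 && word[i] != text[m])
            i = fail[i];
        if (++i >= n) {
            // The whole word matched; keep going from its longest border.
            ++cnt;
            i = fail[i];
        }
    }
    return cnt;
}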

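int main() {
    int n;  // number of test cases

    scanf("%d", &n);

    while (n--) {
        // Field widths keep the input within the buffer sizes declared above.
        scanf("%10000s", word);
        scanf("%1000000s", text);
        mkFail();
        printf("%d\n", kmp());
    }

    return 0;
}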
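To make it easier to see what the table looks like for a concrete word, here is a small sketch of the same two loops rewritten to take their inputs as arguments instead of globals. The names build_fail and count_matches are my own placeholders for this illustration, and the sketch assumes a non-empty word and a caller-provided table with room for strlen(word) + 1 entries.

```c
#include <assert.h>
#include <string.h>

/* Fill fail[0..n] for word, using the same recurrence as mkFail().
   Assumption: fail has room for strlen(word) + 1 ints. */
void build_fail(const char *word, int *fail) {
    assert(word != NULL && fail != NULL);
    int n = (int)strlen(word);
    int j = fail[0] = -1;
    for (int i = 1; i <= n; i++) {
        while (j >= 0 && word[j] != word[i - 1])
            j = fail[j];
        fail[i] = ++j;
    }
}

/* Count occurrences of word in text, overlaps included.
   Assumption: word is non-empty and fail was built for this word. */
int count_matches(const char *word, const char *text, const int *fail) {
    assert(word != NULL && text != NULL && fail != NULL);
    int n = (int)strlen(word);
    int cnt = 0;
    for (int i = 0, m = 0; text[m] != '\0'; m++) {
        while (i >= 0 && word[i] != text[m])
            i = fail[i];
        if (++i >= n) {
            ++cnt;
            i = fail[i];
        }
    }
    return cnt;
}
```

For the word "abab", build_fail fills the table with -1, 0, 0, 1, 2. Searching for "aba" in "ababa" gives 2, because the two occurrences overlap and falling back to fail[n] after a full match keeps the shared "a".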









