KMP related

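A certain kind of String is built by repeating a shorter String n times, where the shorter String has length at least 2. Given a String as input, write code that returns T or F.
Examples:
“abcabcabc” True, because it is abc repeated 3 times
“bcdbcdbcde” False, because the last piece is bcde
“abcdabcd” True, because it is abcd repeated 2 times
“xyz” False, because it is not a repetition of any String
“aaaaaaaaaa” False, because the repeated short String must have length at least 2 (it cannot be treated as aa repeated 5 times here)

The algorithm must run in O(n) time.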

We can apply the KMP algorithm, taking ababab as the example. First compute the failure function as in KMP:

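public static int [] failure(String s) {
	// f[i] = length of the longest proper prefix of s.substring(0, i) that is also its suffix
	int [] f = new int[s.length() + 1];
	// f[0] and f[1] stay 0: Java zero-initializes the array (this also keeps the empty string safe)
	int k = 0; // length of the prefix/suffix we are currently trying to extend
	for(int i = 1; i < f.length - 1; i++) {
		if (s.charAt(i) == s.charAt(k)) {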
			f[i+1] = k+1;
			k++;
		}
		else if (k != 0) {
			k = f[k];
			i--; // try next k with same i
		}
		else {
			f[i+1] = 0;
		}
	}
	return f;
}

The same failure function can be used to solve the string concatenation problem above. Take the string “ABABAB” and collect every string that is both a proper prefix and a suffix of it, sorted from longest to shortest. We get
1. ABAB
2. AB
3. empty string

Each of these suffix/prefix strings has a matching string that, when inserted in front of it, gives back the original string. Listed in the same order, these are what we call the “augmenting strings”.

1. AB
2. ABAB
3. ABABAB
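The two lists above can be reproduced by walking a failure table from its last entry. The following is only a sketch: the class name BorderDemo and the methods borders and augmenting are made up for illustration, and the class carries its own copy of a KMP failure function (written with a while loop) so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

class BorderDemo {
    // f[i] = length of the longest proper prefix of s.substring(0, i) that is also its suffix.
    static int[] failure(String s) {
        int[] f = new int[s.length() + 1];
        int k = 0;
        for (int i = 1; i < s.length(); i++) {
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = f[k]; // fall back to the next shorter candidate
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            f[i + 1] = k;
        }
        return f;
    }

    // Every string that is both a proper prefix and a suffix of s, longest first.
    static List<String> borders(String s) {
        int[] f = failure(s);
        List<String> result = new ArrayList<>();
        int n = f[s.length()];
        while (n > 0) {
            result.add(s.substring(0, n));
            n = f[n];
        }
        result.add(""); // the empty string always qualifies
        return result;
    }

    // For each prefix/suffix string, the part that must be put in front of it to rebuild s.
    static List<String> augmenting(String s) {
        List<String> result = new ArrayList<>();
        for (String b : borders(s)) {
            result.add(s.substring(0, s.length() - b.length()));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(borders("ABABAB"));    // [ABAB, AB, ]
        System.out.println(augmenting("ABABAB")); // [AB, ABAB, ABABAB]
    }
}
```

Following f from the tail visits each suffix/prefix length exactly once, so the walk is linear in the length of the string.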

Each augmenting string is a prefix of the original string, and the matching suffix/prefix string is what remains after that augmenting string is removed from the front. If the suffix/prefix string is at least as long as the augmenting string, it must itself start with the augmenting string, because both are prefixes of the original string. That means the original string starts with at least two copies of the augmenting string, and the same argument can be repeated on what remains. The copies fill the string exactly only when the length of the candidate augmenting string divides the length of the original string with no remainder.
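A brute-force check can make this argument concrete. This is a sketch for illustration only, not part of the O(n) solution: PeriodCheck, hasPeriod and isRepetitionOf are hypothetical names, and p stands for the length of a candidate augmenting string.

```java
class PeriodCheck {
    // True when s has a suffix/prefix string of length s.length() - p,
    // which is the same as s.charAt(i) == s.charAt(i + p) for every valid i.
    static boolean hasPeriod(String s, int p) {
        for (int i = 0; i + p < s.length(); i++) {
            if (s.charAt(i) != s.charAt(i + p)) {
                return false;
            }
        }
        return true;
    }

    // s is its first p characters repeated at least twice
    // only if p is such a shift AND p divides the length of s.
    static boolean isRepetitionOf(String s, int p) {
        return p > 0 && p < s.length() && s.length() % p == 0 && hasPeriod(s, p);
    }

    public static void main(String[] args) {
        System.out.println(isRepetitionOf("ABABAB", 2)); // true
        System.out.println(hasPeriod("ABABA", 2));       // true
        System.out.println(isRepetitionOf("ABABA", 2));  // false
    }
}
```

“ABABA” shows why the length test matters: it has the suffix/prefix string “ABA” with augmenting string “AB”, but 2 does not divide 5, so it is not “AB” repeated.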

So what we need to do is check the augmenting strings one by one, from the shortest all the way to the longest (the original string itself). If the length of the matching suffix/prefix string is a multiple of the length of an augmenting string, then that augmenting string is the concatenating string. Walking through these lengths is what the failure function was designed for. The failure function for S = “ABABAB” would be F = {0, 0, 0, 1, 2, 3, 4}.
So we simply follow the failure function from the tail, starting at n = F[6] = 4:

public static String minConcat(String s) {
	int [] f = failure(s);
	int n = f[s.length()];
	String concast = s;
	while(n != 0) {
		if (n%(s.length() - n) == 0) { // if n is smaller than s.length() - n, the remainder is n itself and cannot be 0
			concast = s.substring(0, s.length() - n);
		}
		n = f[n];
	}
	return concast;
} 
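
To get back to the original T/F question, the same table can be turned into a boolean check. This is a minimal sketch under one assumption: going by the “aaaaaaaaaa” example, I read the "length at least 2" rule as applying to the shortest repeating piece. RepeatCheck and isRepeated are illustrative names, and the class has its own failure function so it can run alone.

```java
class RepeatCheck {
    // f[i] = length of the longest proper prefix of s.substring(0, i) that is also its suffix.
    static int[] failure(String s) {
        int[] f = new int[s.length() + 1];
        int k = 0;
        for (int i = 1; i < s.length(); i++) {
            while (k > 0 && s.charAt(i) != s.charAt(k)) {
                k = f[k];
            }
            if (s.charAt(i) == s.charAt(k)) {
                k++;
            }
            f[i + 1] = k;
        }
        return f;
    }

    static boolean isRepeated(String s) {
        int len = s.length();
        if (len == 0) {
            return false;
        }
        int longest = failure(s)[len]; // longest suffix/prefix string
        int unit = len - longest;      // shortest augmenting string
        // Assumption: the shortest repeating piece must have length >= 2.
        return longest > 0 && len % unit == 0 && unit >= 2;
    }

    public static void main(String[] args) {
        String[] inputs = {"abcabcabc", "bcdbcdbcde", "abcdabcd", "xyz", "aaaaaaaaaa"};
        for (String in : inputs) {
            System.out.println(in + " " + isRepeated(in));
        }
    }
}
```

Only the longest suffix/prefix string needs testing here: if any shorter piece repeats to fill the string, the shortest augmenting string also divides the length. Building the table is O(n), so the whole check stays within the required bound.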
