Wildcard Matching

‘?’ Matches any single character.
‘*’ Matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch(“aa”,”a”) → false
isMatch(“aa”,”aa”) → true
isMatch(“aaa”,”aa”) → false
isMatch(“aa”, “*”) → true
isMatch(“aa”, “a*”) → true
isMatch(“ab”, “?*”) → true
isMatch(“aab”, “c*a*b”) → false

整个想法是这样的:

我们用两个指针,一个sp和一个pp来扫描两个数组。
还有两个变量一个是pstar用来记录p中star的位置,准确的来说是*后面那个数的位置,还有一个sstar记录同样的*对应在s中的位置。
pstar和sstar都被设为-1,这样我们就知道是否发现过*了。

举个例子
s = abaaacdef
p = ab**ac?**
扫描过程是这样的,首先判断是否s和p是否是空集,如果s是空集,那么p要么是空集要么全是*,不然就是false。p是空集则s必须是空集才是match。

然后sp和pp同时从0开始,如果两个指针指向的值相等或者sp指向的是?则两指针同时++。

如果pp发现了一个*,则pstar指向*之后的位置,这时sstar记下当前sp指向的值,然后向后扫,在上面的例子中,如果又发现*,则pstar被更新而sstar还是指向第二个出现的”a”.

然后sp从sstar,pp从pstar开始同时往后扫,如果在下一个*或者结束前(注意结束也看成一个字符来match即可)。没有完全match上,则pp退回pstar,而sp从这个sstar的下一个值sstar++开始搜,再继续搜直到match或者s搜完为止。以上例子中先比较ac aa, fail,然后sstar++再比较还是ac aa,最后直到sstar再次++比较ac?和acdmatch上了。

这时候pp又碰到了*,于是再继续,当我们发现p中最后的字符是一个或多个*,则可以直接结束。

public class Solution {
    public boolean isMatch(String s, String p) {
        if (s.length() == 0) {
            if (p.length() == 0)
                return true;
            for (int i = 0; i < p.length(); i++) {
                if (p.charAt(i) != '*')
                    return false;
            }
            return true;
        }
        if (p.length() == 0) {
            return (s.length() == 0);
        }
        int sp = 0, pp = 0; //two index
        int star = -1, sstar = -1; //position of the last star
        while( sp < s.length()) {
            if (s.charAt(sp) == p.charAt(pp) || p.charAt(pp) == '?') { // if match
                pp++;
                sp++;
                if (pp == p.length()) {// if end char does not match.
                    if (sp != s.length()) {
                        if (star == -1)
                            return false;
                        pp = star;
                        sstar++;
                        sp = sstar;
                    }
                    else {
                        return true;
                    }
                }
            }
            else if (p.charAt(pp) == '*') {
                pp++;
                star = pp;
                sstar = sp;
                if (star == p.length())
                    return true;
            }
            else { // if no match (need to deal with end char)
                if (star == -1)
                    return false;
                pp = star;
                sstar++;
                sp = sstar;
            }
        }
        if (sp == s.length() && pp == p.length())
            return true;
        for (int i = pp; i < p.length(); i++) {
            if (p.charAt(i) != '*')
                return false;
        }
        return true;
    }
}

We can have some preprocessing to remove all the continuous * to accelerate the process.

public class Solution {
    public boolean isMatch(String s, String p) {
        //remove all the continuous *
        if (s.length() == 0 && p.length() == 0)
            return true;
        if (p.length() == 0)
            return false;
        StringBuilder tmp = new StringBuilder();
        tmp.append(p.charAt(0));
        for (int i = 1; i < p.length(); i++) {
            if (!(p.charAt(i) == '*' && p.charAt(i-1) == '*'))
                tmp.append(p.charAt(i));
        }
        p = new String(tmp);
        int sstart = -1; // if no previous *
        int spoint = 0;
        int pstart = -1; 
        int ppoint = 0;
        while(spoint != s.length()) {
            if (ppoint < p.length() && (s.charAt(spoint) == p.charAt(ppoint) || p.charAt(ppoint) == '?')) { // if ppoint > p.length, send it to mismatch branch
                spoint++;
                ppoint++;
            }
            else if (ppoint < p.length() && p.charAt(ppoint) == '*') { // if found a *, then start from current position, previous * already matched, start from current location
                sstart = spoint;
                ppoint++;
                pstart = ppoint;
                if (ppoint == p.length()) // if this is the ending *, then match is done.
                    return true;
            }
            else {// if mismatch
                if (pstart == -1) // no previous *
                    return false;
                ppoint = pstart;
                sstart++;
                spoint = sstart;
            }
        } 
        if (ppoint == p.length() || (ppoint == p.length() -1 && p.charAt(ppoint) == '*')) //when s hit the end, if p is the end or p is the last '*' (we already remove all the continuous *), then we return true otherwise false; 
            return true;
        return false;
    }
}

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s