Distinct Subsequences

首先扫描S找与T的第一个字符相同的字符,每次找到这个字符就从这个字符开始,递归地求S和T剩下的串,将返回的值加在一起并返回上一层。T为空串时,返回1。因为空串本身是另外一个串的一个子序列。这个算法实现简单,但是大集合超时。

另外一种办法就是dp,用dp[i][j]表示子数组是前i个字符,母数组是前j个字符的match的个数。那递推关系是怎么样的呢?如果母字符串增加一个j,子字符串不变还是为[i],如果增加的母字符S[j]和子字符的最后一个字符T[i]不相等,那么match数目不变,因为还是match母字符前j-1个字符,这时候dp[i][j]= dp[i][j-1],而如果母字符串中新增加的这个字符S[j]和子字符的最后一个字符T[i]相等,那么就可以用S[j]mtachT[i],剩下的S[j-1]matchT[i],当然也可以不用S[j]match而还是用S[j-1]match,所以dp[i][j] = dp[i][j-1] + dp[i-1][j-1].
r a b b b i t
1 1 1 1 1 1 1 1
r 0 1 1 1 1 1 1 1
a 0 0 1 1 1 1 1 1
b 0 0 0 1 2 3 3 3
b 0 0 0 0 1 3 3 3
i 0 0 0 0 0 0 3 3
t 0 0 0 0 0 0 0 3
继续优化就是可以用滚动数组而不用一整个二维数组来做dp,这样可以省下空间。(只要用一个S.length()的数组一直overwrite就可以),但是要注意我们每次更新因为是从左往右,如果用滚动数组就会被覆盖掉原来的值,所以我们需要用两个变量,一个temp,一个last,每次temp保存当前的值,last是当前循环用的值。还有就是要注意两个S和T都要有一行来确保null的情况,这样计算才正确(第一行所有都是1,每次循环第一个数要设成0,注意用temp保存原来的值。

public class Solution {
    public int numDistinct(String S, String T) {
        if (T.length() == 0)
            return 1;
        int match[] = new int [S.length() + 1]; // plus the null
        for (int j = 0; j <= S.length(); j++) // first row all 1
        {
            match[j] = 1; //when i = 0, t is 0, except s == 0, all other are 1 becase null is subsequance for all the s
        }
        
        for(int i = 0; i < T.length(); i++) 
        {   
            int temp = match[0]; // stored for next avoid being overwritten;
            match[0] = 0;
            for (int j = 1; j <= S.length(); j++) 
            {
                int last = temp;
                temp = match[j]; // stored for next avoid being overwritten;
                if (S.charAt(j-1) == T.charAt(i))// string charAt start at 0 but j start at 1
                    match[j] = last + match[j-1];
                else
                    match[j] = match[j-1];
            }
        }
        return match[S.length()];
    }
}

Second Submission:

The same idea, using 2d array.
What behind the idea is the dynamic programming. The T array start from 1 and scan through the S array, then T array goes to index 2 and scan the S array with the result from the first scan. It is a trade between time and space.

Pay attention to the first row when T index is 1. For every other row, if index S is 0, then the dp result will always be 0 because you can not match a longer T to a shorter S.

public class Solution {
    public int numDistinct(String S, String T) {
        int [][] dp = new int [T.length()][S.length()];
        if (T.length() == 0) 
            return 1;
        if (S.length() == 0) 
            return 0;
        dp[0][0] = (S.charAt(0) == T.charAt(0))? 1 : 0;
        
        for (int j = 1; j < S.length(); j++) { // init first line when T is one char  i == 0
            if (S.charAt(j) == T.charAt(0)) 
               dp[0][j] = dp[0][j-1] + 1;
            else
                dp[0][j] = dp[0][j-1];
        }
        
        for (int i = 1; i < T.length(); i++) {
            for (int j = 0; j < S.length(); j++) {
                if (j == 0)
                    dp[i][j] = 0;
                else {
                    if (S.charAt(j) == T.charAt(i))
                        dp[i][j] = dp[i-1][j-1] + dp[i][j-1];
                    else
                        dp[i][j] = dp[i][j-1];
                }
            }
        }
        return dp[T.length()-1][S.length()-1];
    }
}

Remember that for this question, we fix the index in T and all index in S to fill in the dp[][] step by step. Notice the number of matches for the empty strings, they are the initial conditions for the dp to start.

    public int numDistinct(String S, String T) {
        //start from the null string
        int [][] dp = new int [S.length() + 1][T.length() + 1];
        dp[0][0] = 1;
        //if j is none empty and i is empty, then 0 match
        for (int j = 1; j < T.length() + 1; j++)
            dp[0][j] = 0;
        //if i is none empty and j is empyt, then 1 match
        for (int i = 1; i < S.length() + 1; i++)
            dp[i][0] = 1;
        //fix the index in j and try increase the index in i to fill up dp[][] step by step
        for (int j = 1; j < T.length() + 1; j++) {
            for (int i = 1; i < S.length() + 1; i++) {
                if (S.charAt(i-1) == T.charAt(j-1))
                    dp[i][j] = dp[i-1][j-1] + dp[i-1][j];
                else 
                    dp[i][j] = dp[i-1][j];
            }
        }
        return dp[S.length()][T.length()];
    }

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s