LeetCode Subsets I and II: DFS, Iterative (Bit Operation), Duplicate Handling

Overview

3 Methods including DFS, Bit Operation, Pure Iterative are discussed to solve LeetCode Subsets I and II, and key facts are explained to remove duplicates.

LeetCode Subsets I and II

(1)

Given a set of distinct integers, S, return all possible subsets.

Note:

  • Elements in a subset must be in non-descending order.
  • The solution set must not contain duplicate subsets.

For example,
If S = [1,2,3], a solution is:

[
  [3],
  [1],
  [2],
  [1,2,3],
  [1,3],
  [2,3],
  [1,2],
  []
]

(2)
The further question asks you to find all the unique subsets when there are repeated numbers in the original set.
For example,
If S = [1,2,2], a solution is:

[
  [2],
  [1],
  [1,2,2],
  [2,2],
  [1,2],
  []
]

Analysis: DFS, Iterative (Bit Operation), Duplicate Handling

When there are no duplicates in the input, the key idea to generate all subsets is that for each number from the input, we have two choices of whether put it into the subset or not and thus generate “2^n” number of total subsets. This idea could be implemented by three different approaches:

  1. DFS: Standard backtracking approach. Nothing new, but follow the standard DFS to code it up
  2. Bit Operation: This is quite elegant to use bit operation. Since we know the total number of subsets anyways, and for the k-th subset, we can use the binary representation of k to indicate which number is in this subset or not. That is, since k’s value ranges from o to 2^n, the binary representation for k covers all different combination of 0/1 for each of the n numbers, if j-th bit of k is 1, it indicate that we need to put S[j] into k-th subset, otherwise, not, and it covers all subsets without missing any one or adding any duplicate subsets. The only disadvantage of this approach is that I found it difficult to extend it to solve Subsets II with duplicate in the input, but method 1 and 3 could be modified a little bit to natural solve both Subsets I and II, anyone have better ideas to extend this method to solve Subsets II please share your idea and help! Thanks!
  3. This is again an iterative approach but from another angle. Similar to the iterative method (the 3rd method) in “LeetCode Permutations I and II“, starting from the empty subset, we can extend the already added subsets to new ones by adding the next number in the input to the existed subsets. That is, by definition, suppose we already have subsets of S[2..n], then to generate the subsets of S[1…n], we simply add S[1] to all the subsets of S[2…n] and together with those subsets of S[2…n], all of them are the subsets of S[1…n] which is exactly twice many as for S[2..n].

Well, now we have the above 3 methods, we now talk about how to handle the input which has duplicates. The essential key facts about how to handle the duplicates is as follows:

For any number S[i] appearing t times in the input, we only have (t + 1) choices whether put none of them into the subsets, or 1 number in the subsets, or 2 numbers of S[i] into subsets, or … all of them (i.e., there are t S[i] in the subsets) into the subsets. The big difference is that, if the t numbers are different from each other, then we have 2^t choices.

As long as we keep the above fact and keep it true in our coding, we can avoid generating duplicated subsets in the results. And there are two different implementations from two different angle to achieve the above fact.

(1) The essential thing we need to achieve is just try the number S[i] which is appearing t times for (t+1) combinations instead of 2^t. So just keep some code to do this: if two adjacent numbers say A and B are equal, then only when A is used, B could be used normally to try either putting B into the subsets or not, otherwise, B can never be put into any subsets Always keep this in mind, then there won’t be any duplicated subsets. This angle would be integrated into the DFS approach. Let’s see an example S = [1,2,2], when doing enumerating (backtracking), say in the current step, we already select 1 into the subset, we need to consider whether or not to put 2 into subsets, since at current point, this is the first 2, and we can try normally and one is to put 2 into subsets, and the other is not to use 2 in subsets, now we are, need to consider the second 2, for those subsets do not contain the first 2, we cannot use the second 2 either. Why? Because we have already tried the first 2 to generate the subset containing a single 2, if we try to put the second 2 into those subsets containing no 2s, then we actually generate a duplicated subset with a single 2 in it. This is what it really means by B could only be used when A is already used. The following DFS implementations is accepted by LeetCode OJ to pass both the Subsets I and II problems:

class Solution {
public:
vector<int> ans;
vector<bool> inSet;
vector<vector<int>> results;
void dfs(int i, int n, const vector<int> &S) {
    if (i >= n) { results.push_back(ans); return; }
    if (i == 0 || S[i] != S[i - 1] || inSet[i - 1]) {
        inSet[i] = false;
        dfs(i+1, n, S);

        ans.push_back(S[i]);
        inSet[i] = true;
        dfs(i+1, n, S);
        ans.pop_back();
        inSet[i] = false;
    }
    else  {
        inSet[i] = false;
        dfs(i+1, n, S);
    }
}

vector<vector<int> > subsetsWithDup(vector<int> &S) {
    sort(S.begin(), S.end());
    inSet.resize(S.size());
    fill (inSet.begin(), inSet.end(), false);
    dfs(0, S.size(), S);
    return results;
}
};

(2) Another different angle to remove duplicates is to treat the continues duplicates number S[i] which appears t times as a whole, and this whole thing have 1 + t choices for us, to put 0..t of them into the subsets. This implementation could be integrated into the 3rd approach fro Subsets I discussed previously and the following iterative implementations is accepted by LeetCode OJ to pass both the Subsets I and II problems:

class Solution {
public:
vector<vector<int> > subsetsWithDup(vector<int> &S) {
    sort(S.begin(), S.end());
    vector<vector<int>> vv(1);
    int i= 0, cnt = 0;
    int n = S.size();
    while (i < n) {
        cnt = 1;
        while (i + cnt < n && S[i] == S[i + cnt]) ++cnt;
        for (int k = vv.size() - 1; k >= 0; --k) {
            vector<int> tmp = vv[k];
            for (int j = 1; j <= cnt; ++j) {
                tmp.push_back(S[i]);
                vv.push_back(tmp);
            }
        }
        i += cnt;
    }
    return vv;
}
};

And note this is quite similar and consistent with the 3rd method) in “LeetCode Permutations I and II

(3) The final code is the implementation of the bit operation approach, unfortunately, I current do not know how to directly extend this approach to handle duplicates, the following code is accepted by LeetCode OJ only to poss the Subests I problem:

vector<vector<int> > subsets(vector<int> &S) {
    sort(S.begin(), S.end());
    int n = S.size();
    int m = pow(2, n);
    vector<vector<int>> ret(m, vector<int>());
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i & (1 << j)) ret[i].push_back(S[j]);
        }
    }
    return ret;
}

Summary

3 Methods including DFS, Bit Operation, Pure Iterative are discussed to solve LeetCode Subsets I and II, and key facts are explained to remove duplicates.

Written on April 27, 2013