LeetCode Longest Consecutive Sequence: Hashing the Boundary

Overview

LeetCode Longest Consecutive Sequence: use a hash table to associate the boundary value with the, for each v, we check v + 1 and v – 1 in constant time.

LeetCode Longest Consecutive Sequence

Given an unsorted array of integers, find the length of the longest consecutive elements sequence.

For example,
Given [100, 4, 200, 1, 3, 2],
The longest consecutive elements sequence is [1, 2, 3, 4]. Return its length: 4.

Your algorithm should run in O(n) complexity.

Solution: Hashing the Boundary

One naive approach is use a direct addressing table, hash each value into the direct addressing table, suppose the maximum number is M while the minimum value is m, then linear scan from m to M and count maximum number of consecutive numbers. Note this algorithm is Pseudo-Linear the time complexity is O(M-m) not O(n).

The correct solution is a bit counter intuitive. The key point is that for each number say Val in the array, we only need to determine whether its neighbor (Val-1 and Val+1) is in the array and  for a consecutive sequence we only need to know the the boundary and the length information, we should utilize hash map.The pseudo code is like the follows and I will give more details after the pseudo code:

int longest(vector &num) {
    hashmap hm();
    foreach val in num {
        if val is already in hm
        then continue;

        put (val, 1) in the hashmap;
        if(val - 1 is in the hashmap)
              combine the sequence with val-1 as the right boundary with val
        if(val +1 is in the hashmap)
              combine the sequence with val+1 as the left boundary with val

        determine the maximum length
    }
    return the longest sequence;
}

}

There are some more clear algorithm using two hash maps. But in fact one is enough. The key observation is that (1) we don’t need  middle number information in a consecutive sequence. (2) The left bound or right bound information don’t have to be maintained explicitly by using two hash map one for left boundary and the other for right boundary. The following code is accepted by LeetCode OJ to pass this Longest Consecutive Sequence problem:

class Solution {
    public:
        int longestConsecutive(vector &num) {
            unordered_map<int, int> boundary;

            int len = 0;
            for (int v : num) {
                if (boundary.find(v) != boundary.end())
                    continue;
                int lLength = boundary.find(v-1) == boundary.end() ? 0 : boundary[v-1];
                int rLength = boundary.find(v+1) == boundary.end() ? 0 : boundary[v+1];

                int length = lLength + rLength + 1;

                len = max(len, length);
                boundary[v-lLength] = length;
                boundary[v+rLength] = length;
                boundary[v] = length;
            }
            return len;
        }
};

Another alternative approach: By using hash table, one can check if a number is in the given sequence in constant time. So the solution could be like this, from left to right of the given sequence, say the current number is N, then we continuously check N-1, N-2…. until we encounter say N – j which is not in the sequence, similarly we check N+1, N+2 … until we encounter say N + k which is not in the sequence. So the longest consecutive sequence contain N is of length k – j – 1. Remember to set all the number in this sequence as visited so we don’t need to repeatedly check them again. This approach is O(N) time because each consecutive sequence might only be checked twice linearly. The following code implements this idea:

int longestConsecutive(vector &num) {
    unordered_map<int, bool> existed;
    unordered_set checked;

    for (int v : num)
        existed[v] = true;

    int len = 0;
    int localLen = 1;
    int val = 0;
    for (int v : num) {
        if (checked.count(v))
            continue;

        val = v;
        localLen = 1;
        checked.insert(v);
        while (existed.find(++val) != existed.end()) {
            ++localLen;
            checked.insert(val);
        }
        val = v;
        while (existed.find(--val) != existed.end()) {
            ++localLen;
            checked.insert(val);
        }
        len = max(len, localLen);
    }
    return len;
}

Note, of course, Sorting can work in O(N + NlogN) time.

Summary

LeetCode Longest Consecutive Sequence: use a hash table to associate the boundary value with the, for each v, we check v + 1 and v – 1 in constant time.

Written on February 23, 2013