Interview

Multiple Solutions to an Interview Problem

In a binary search tree, one node violates the properties of a binary search tree. Find that node.

Published

In a binary search tree, one node violates the properties of a binary search tree. Find that node.

Problem Analysis #

First, let’s review the properties of a binary search tree:

  1. If its left subtree is not empty, the values of all nodes in the left subtree are less than the value of its root node.
  2. If its right subtree is not empty, the values of all nodes in the right subtree are greater than the value of its root node.
  3. Its left and right subtrees are also binary search trees.

If we directly and recursively check whether every node in the binary search tree satisfies the properties above, we can certainly solve the problem, but the efficiency is surprisingly low. In fact, the result of an inorder traversal of a binary search tree is exactly an ascending sequence. We only need to check the elements in this sequence that violate ascending order to find the node in the binary search tree that does not satisfy the properties above.

Therefore, the first problem to solve is inorder traversal of a binary tree.

Inorder Traversal of a Binary Tree #

According to the definition of a binary tree, we can define the following data structure:

ocaml
type 'a btree =
    | Nil
    | Node of ('a * 'a btree * 'a btree)
;;

For this data structure, we only need to consider two cases:

  1. The current node is Nil.
  2. The current node is Node.

The corresponding inorder traversal program is as follows:

ocaml
let to_inorder tree =
    let rec aux tree ls =
        match tree with
        | Nil -> ls
        | Node (d, l, r) -> aux l (d :: (aux r ls))
    in
    aux tree []
;;

Or, expressed in the C++ language we are familiar with:

cpp
template<typename T>
struct btree
{
    T data;
    btree *left;
    btree *right;
};

template<typename T>
void ToInOrder(btree<T> *node, std::vector<T> &container) {
    if (node) {
        ToInOrder(node->left, container);
        container.push_back(node->data);
        ToInOrder(node->right, container);
    }
}

Through inorder traversal of a binary tree, we can obtain the sequence we need; then by checking this sequence, we can find the node that does not satisfy the properties of a binary search tree.

Using Only O(1) Space #

The method described earlier requires O(n) space to store the result of the inorder traversal, but in reality we do not need this sequence. We only need the value of the previous element in the sequence. Or, to go one step further, when we perform an inorder traversal of the binary search tree, if we can know the previous node of the current node, we can verify whether the current node satisfies the properties of a binary search tree.

To do this, we need to know the last node of a subtree. In that case, the previous node of the root is the last node of the left subtree; the previous node of the right subtree is the root; and the previous node of the left subtree is the previous node of the current subtree.

ocaml
let check tree =
    let rec do_check tree previousNode =
        let latest tree =
            let rec latest_aux tree parentNode =
                match tree with
                | Nil -> parentNode
                | Node (_, _, r) -> latest_aux r tree
            in
            latest_aux tree Nil
        in
        let rec check_current_right currentNode previousNode =
            match currentNode with
            | Nil -> Nil
            | Node (d, _, r) -> match previousNode with
                                | Nil -> do_check r currentNode
                                | Node (pd, _, _) -> if pd >= d then
                                                        currentNode
                                                     else
                                                        do_check r currentNode
        in
        match tree with
        | Nil -> Nil
        | Node (_, Nil, _) -> check_current_right tree previousNode
        | Node (_, l, _) -> match (do_check l previousNode) with
                            | Nil -> check_current_right tree (latest l)
                            | Node _ as ill_node -> ill_node
    in
    do_check tree Nil
;;

In reality, however, while traversing the left subtree, we can already know which node is the last node of the left subtree. Therefore, with a small modification, we can discard the latest function.

ocaml
let check tree =
    let rec do_check tree previousNode =
        match tree with
        | Nil -> (true, previousNode)
        | Node (d, l, r) ->
            let (l_valid, l_node) = do_check l previousNode in
            if l_valid then
                match l_node with
                | Nil -> do_check r tree
                | Node (pd, _, _) -> if pd >= d then
                                        (false, tree)
                                     else
                                        do_check r tree
            else
                (l_valid, l_node)
    in
    match (do_check tree Nil) with
    | (true, _) -> Nil
    | (false, n) -> n
;;

The corresponding C++ code is as follows:

cpp
template<typename T>
static std::tuple<bool, btree<T> *> DoCheck(btree<T> *tree, btree<T> *previousNode)
{
    if (tree) {
        bool leftValid;
        btree<T> *leftInvalidNode;
        std::tie(leftValid, leftInvalidNode) = DoCheck<T>(tree->left, previousNode);
        if (leftValid) {
            if (leftInvalidNode) {
                if (leftInvalidNode->data >= tree->data) {
                    return std::make_tuple(false, tree);
                } else {
                    return DoCheck<T>(tree->right, tree);
                }
            } else {
                return DoCheck<T>(tree->right, tree);
            }
        } else {
            return std::make_tuple(leftValid, leftInvalidNode);
        }
    } else {
        return std::make_tuple(true, previousNode);
    }
}

template<typename T>
inline btree<T> * Check(btree<T> *tree)
{
    bool valid;
    btree<T> *invalidNode;
    std::tie(valid, invalidNode) = DoCheck<T>(tree, nullptr);
    if (valid) {
        return nullptr;
    } else {
        return invalidNode;
    }
}

Clearly, this code is far more complex than the initial inorder traversal code. So how can we enjoy the simplicity of inorder traversal code while also implementing this kind of complex functionality?

Introducing State #

Some people believe that pure functions should not have state. I do not agree with that view. As long as a function always produces the same output for a given input, it is a pure function. In appropriate situations, introducing state can greatly simplify program design. For the problem raised above, we only need to introduce state to store the value of the previously visited node, and then we can solve the problem with very little code.

The most straightforward idea is as follows:

ocaml
let check tree =
    let p = ref Nil in
    let i = ref Nil in
    let rec check_aux tree =
        match tree with
        | Nil -> ()
        | Node (d, l, r) -> check_aux l;
                            match !p with
                            | Node (pd, _, _) when pd >= d -> i := tree
                            | _ -> p := tree; check_aux r
    in
    check_aux Nil;
    !i
;;

Converting this into C++ code is a bit troublesome. The C++ standard does not allow nested function definitions, so the relatively direct idea is to use a global variable:

cpp
static btree<int> *previousNode;

static btree<int> *DoCheck(btree<int> *tree)
{
    if (tree) {
        btree<int> *r = DoCheck(tree->left);
        if (r) {
            return r;
        } else {
            if (previousNode && previousNode->data >= tree->data) {
                return tree;
            } else {
                previousNode = tree;
                return DoCheck(tree->right);
            }
        }
    } else {
        return nullptr;
    }
}

inline btree<int> *Check(btree<int> *tree)
{
    previousNode = nullptr;
    return DoCheck(tree);
}

However, using a global variable causes two problems. The first is that it is not thread-safe; the second is that a global variable cannot be generic. The first problem can be solved by adding the thread_local keyword, but what about the second problem? We can imitate the idea of closures and use a functor to bind the function and data together. The specific code is as follows:

cpp
template<typename T>
class CheckBinarySearchTreeFunctor
{
public:
    inline btree<T> * operator() (btree<T> *tree)
    {
        previousNode = nullptr;
        return DoCheck(tree);
    }
private:
    thread_local btree<T> *previousNode;
    btree<T> * DoCheck(btree<T> *tree)
    {
        ...
    }
};

To avoid creating a new object every time we use the functor, we can also add the singleton pattern to this functor class:

cpp
template<typename T>
class CheckBinarySearchTreeFunctor
{
    ...
public:
    static CheckBinarySearchTreeFunctor instance;
};

CheckBinarySearchTreeFunctor CheckBinarySearchTreeFunctor::instance;

Is this our ultimate solution?

Going Further #

The solution above is already quite excellent, but it is still not the ultimate solution. Why? Because the method above mixes the general algorithm for inorder traversal of a binary tree with the special algorithm for determining whether a binary search tree is valid. Although the latter applies the former, it cannot reuse the former algorithm; instead, it implements it again itself. This made me realize that there is still room to improve this algorithm.

Think about the for_each function in the STL, which can execute an action while traversing a data structure. Following this idea, I should have added an iterator for inorder traversal to the binary tree. However, that would be too heavyweight for an interview problem, so it is better to directly transform the inorder traversal function:

cpp
template<typename T>
void InOrderVisit(btree<T> *tree, std::function<void(btree<T>*)> action)
{
    if (tree) {
        InOrderVisit(tree->left, action);
        action(tree);
        InOrderVisit(tree->right, action);
    }
}

In this way, the function that determines whether a binary search tree is valid can be written as follows:

cpp
template<typename T>
btree<T> *Check(btree<T> *tree)
{
    btree<T> *previousNode = nullptr;
    btree<T> *invalidNode = nullptr;
    InOrderVisit(tree, [&](btree<T> *node) {
        if (previousNode && previousNode->data >= node->data) {
            invalidNode = node;
        }
        previousNode = node;
    });
    return invalidNode;
}

The earlier ToInOrder method can also be written like this:

cpp
template<typename T>
std::vector<T> ToInOrder(const btree<T> *tree)
{
    std::vector<T> container;
    InOrderVisit(tree, [&container](const btree<T> *node) {
        container.push_back(node->data);
    });
    return container;
}