Let’s get some practice with binary tree operations. Here’s our plan:

Define a BT<E> class with a Node<E> inner class. BT<E> will represent manage the tree, which will be mode out of Node<E>s, with null representing an empty subtree.
Define size() and height() methods, first on Node<E>s and then on the BT<E> itself.
Define an isBST() method, to detect whether or not the tree in question is a binary search tree, i.e., whether or not it has the search property.

Setting up the classes

To start with, let’s define our BT<E> class. We’ll want to use the type parameter E for elements. A BT<E> has just one field: the root, a Node<E>. The Node<E> class will have three fields: one of type E for holding the data, and two more Node<E>s for the left and right children. (Mind the compiler warnings here—it’s easy to forget a <E> and get some yellow squiggles.)

Defining the classes

We’ll define Nodes as an inner class, but this time we’ll do so differently: you want to make a public static inner class with its own generic type parameter, as in:

public class BT<E> {
  public static class Node<E> {
    // node definitions: data element, left and right child, constructors
  }
  
  // tree definitions: root `Node<E>`, constructor
}

Precisely which constructors for Node<E> you define are up to you—define all that are convenient.

Why `public static class Node<E>` and not `class Node`?

So far, our inner classes have been package private instance classes. That is, we declared linked lists using inner classes as:

public class LL<E> implements Iterable<E> {
  class Node {
    private E data;
    private Node next;

    public Node() {
    }

    public Node(E data, Node next) {
      this.data = data;
      this.next = next;
    }
  }

  private Node head; // null means the empty list

  public LL() {
    head = null;
  }

  // ...
}

Such inner classes are completely internal to each LL<E> instance. That’s generally the right way to do things, and how you should do it on the homework.

But in this recitation, we’ll do things a little differently. We need a convenient way to add and remove nodes from the tree. But if we use an instance-specific inner class, as above, it’s pretty complicated to create Nodes.

By creating a static inner class, we can create Nodes outside of any given instance. That’s useful for testing our code, which we’ll do once we have some meaningful code to test.

Defining `size` and `height`

A tree’s size is the number of nodes in it: the empty tree (i.e., with null root) has size 0; the tree with just the a root node has size 1, and the following tree has size 5:

   b
  / \
 a   d
    / \
   c   e

A tree’s height is its number of “levels”—the tree above has height 3. Alternatively, a tree’s height is the number of nodes in the longest path from the root: again 3, from the path b-d-e. The empty tree has height 0; a tree with just a root has height 1 (contra anything else you might have heard). This tree has a height of 4:

   b
  / \
 a   d
    / \
   c   e
      /
     f

Implement methods for size and height on BT<E>; to do so, you’ll need similar methods on Node<E>. How should they work?

Defining `size`

First, add a declaration to BT<E> for the size() function, returning an int. It’s a simple method:

If the root is null, then you should return 0.
If the root is not null, then you should return root.size().

Well… that second step is begging the question. Let’s define Node.size(). Suppose we have a Node<E>. What is the size of the subtree from that node? Well:

The node itself contributes 1 to the size.
The left child has some size.
The right child has some size.

So the subtree from that node has size 1 + left.size() + right.size(). But there’s a catch: what if left or right are empty? Just as BT.size() tests to see if root == null, we’ll have to test to see if left or right are null. (Hint: the ternary expression cond ? then : else is helpful here!)

Defining `height`

Once you have size defined, it shouldn’t be too bad to define height(). Its signature is similar: no arguments, returns an int. But rather than adding 1 to the sum of the child heights, we want to do something slightly different at each node: we add 1 to the maximum child height. (Hint: check out Math.max!)

Testing everything

We need to test our code! Create a new JUnit test class in Eclipse, and add tests for size() and height() on:

The empty tree
The singleton tree, with a root that is also a leaf, i.e., without children
The two tree examples above

In writing this code, you’ll have to manually write Node<String> calls. For example, we can create the following tree:

   c
  / \
 a   d
    /
   e

By calling:

BT<String> t = new BT<>();

t.root = new Node<>("c", 
                    new Node<>("a", 
                               null, 
                               null), 
                    new Node<>("d", 
                               new Node<>("e", 
                                          null,
                                          null), 
                               null))

Depending on the constructors you define for Node<E>, your code may look slightly different. (Note that we _couldn’_t do this if we hadn’t made Node<E> a public static inner class. Remove the modifiers and see what Eclipse says!)

Checking for the BST property

Finally, let’s define a predicate that determines whether or not a tree is a binary search tree, i.e., whether it has the search property. Let’s recall the definition of the search property:

The empty tree is a search tree.
Given a tree t rooted at a node n with value v, left child l, and right child r, we have that t is a search tree when:
- every value in l is strictly less than v
- every value in r is strictly greater than v
- l is a search tree
- r is a search tree

There are two ways to implement this function, and one is much more efficient then the other.

Direct implementation

One approach is to directly implement a function that returns true when the search property holds. Here’s the plan:

Recursively process each node.
Empty nodes (i.e., null) are trivially search trees.
For every non-empty node, traverse its children to make sure the values in the left subtree are all less than the value and the values in the right subtree are all greater than the value.

Such a solution works fine. You’ll need some helper methods in Node<E> to do the “all greater than” and “all less than” searches.

Narrowing bounds

The approach above has a disadvantage: a node in level h will be visitedh times, giving us a fairly poor complexity. We can do the checking in O(n) time if instead of traversing the entire subtree, we keep track of our bounds and recursively narrow them. Suppose we’re checking to see that the following tree is a BST (hint: it isn’t):

   a
  / \
 b   d
    / \
   c   e

When we look at the root a, we have no constraints When we look at b, we know that everything must be less than a—that is, a is our new upper bound. At this point we can confidently say that our tree is not a search tree, because b is greater than a in the conventional alphabetical ordering, but let’s continue just to get the feel for things.

When we look at d’s subtree, we know that everything must be greater than a. The left subtree of d must be between a (our lower bound) and d (our new upper bound). The right subtree of d must be greater than d (but has no upper bound).

So here’s a plan:

Start with lo and hi bounds that are null.
Each time you visit a node, you first check that its data is greater than lo and less than hi. (Now, if lo or hi is null, there’s nothing to check.)
Check the left subtree using data as the new hi.
Check the right subtree using data as the new lo.

Implement it!

Go ahead and implement both of these strategies. You’ll need to change your type declarations so that you have BT<E extends Comparable<? super E>>, in order to use compareTo(E).

The direct strategy demands an isBSTDirect() method on BT<E> and a plain isBST() method on Node <E>, with some extra helpers. The narrowing bounds method requires an isBST() method on BT<E> and a fancier isBST(E lo, E hi) method on Node<E>. Good luck!

How does this relate to the homework?

The binary trees you’ve implemented today give you a great start for thinking about the homework. You’ll want to make a few changes on the homework:

Node<E> can become a package-private or private inner class on BT<E>, since you’ll have the insert(E) method to add elements rather than manually building nodes.
You shouldn’t need isBST(), but it might be helpful for testing.

CS 284-A Data Structures — Spring 2022

Week 10: Binary Tree Operations

Setting up the classes

Defining the classes

Why `public static class Node<E>` and not `class Node`?

Defining `size` and `height`

Defining `size`

Defining `height`

Testing everything

Checking for the BST property

Direct implementation

Narrowing bounds

Implement it!

How does this relate to the homework?

CS 284-A Data Structures — Spring 2022

Week 10: Binary Tree Operations

Setting up the classes

Defining the classes

Why public static class Node<E> and not class Node?

Defining size and height

Defining size

Defining height

Testing everything

Checking for the BST property

Direct implementation

Narrowing bounds

Implement it!

How does this relate to the homework?

Why `public static class Node<E>` and not `class Node`?

Defining `size` and `height`

Defining `size`

Defining `height`