# Trees

Trees are remarkably useful and powerful data structures, with many applications. Mohamed El Dawy explains.

In this article, I am going to give an introduction to trees. Trees are unique data structures that are used to store data in a tree-like shape. They have many useful applications, which include 3D programs, data compression, Web browsers and even dictionaries. All in all, trees are wonderful data structures. Let’s get started.

What are trees?

As I said, trees are a kind of data structure. If the phrase “data structure” is new to you, don’t worry. It just means the way you represent data in the memory of a computer. So, trees are no more than a method for storing data in the memory of a computer. Which type of data? And when should we use trees? This is what we are going to discuss in a moment.

Trees look like the following diagram. First we have a root node which contains one item of data. The root node has some children,  and each one contains one item of data. Each one of the children, in turn, has some children, and so on…

In essence, it looks something like this:

Fig.1 An example of a tree

Now, you may find this pretty weird. Why would someone want to represent data that way? Let’s assume we have a list of n numbers; what is the point of putting them in a tree? As we are about to see, there are some good reasons to do so. Additionally, in real life, there are several applications where trees are just the natural way to represent data.

First, before we go too far, let’s look at some theory. Any tree can be represented by a tree where each node has a maximum of two children. This is called a ”binary tree”. A famous method to accomplish this is called “Left child, right sibling,” but this is quite outside the scope of this article; check the further reading at the end of the article if you really need to know how to do it.

A binary tree looks something like this:

Fig 2. An example of a binary tree

Why did I introduce this? It has a very important implication. We don’t need to study general trees in depth, since any tree can be represented as a binary tree. It makes sense then that, from now on, we will only focus on binary trees. As I said, we are not limiting ourselves, because any tree can be represented by a binary tree.

First, let’s examine this fundamental problem: how can we represent a tree inside the computer memory?

{mospagebreak title=Representation}

We can think of the memory of a computer as just a huge one dimensional array (a really huge one, indeed!). Now, look at the trees we have just drawn above. How would they fit into the memory of a computer? That looks like a tricky question, but the answer is not that hard. In fact we can come up not just with one, but actually two, different representations.

Representation 1… Using an Array

The first way to represent a tree is quite simple. Since computer memory can be thought of as a one dimensional array, let’s represent the tree using a one dimensional array. Each node of the tree will be represented as an element of the array. The root of the tree will be at the very first location of the array. The rest of the nodes follow like reading the tree from top to bottom, and from left to right (just like you are reading this page).

So, for example, this tree..

Fig 3a. This tree…

Will end up looking like this…

 A B C D E F G

Fig 3b. will be represented using this array

Pretty simple, huh? Problems occur, though, when there are gaps in the tree… For example, let’s look at this tree…

Fig 4a. And this tree…

To represent it, we will have to waste some locations in the array, in order to represent the tree correctly. This is how the array will look. (Unused locations have been marked as ‘X.’ In real life, you can insert any invalid value to indicate an empty node)

 A B C D X E F X X X X X X G H

Fig 4b. will be represented using this array

The first X (after the D) is for the missing right child of B. Other invalid elements are for the missing children of D, E and the children of the missing child of B. This gives us seven wasted nodes to represent a tree with eight nodes, which is not very good at all.

So, what is good about this representation? First, it is easy to navigate the tree. Given any node, it is easy to find its parent, and its two children. How? Let’s assume the root of the tree is put at location 1 of the array. Now, if we have a node at location x, it is easy to show that its parent is at location x/2 (if x is odd, just ignore the fractional part), and its children are at locations x*2, and x*2+1.

Don’t believe this? Take a good look at the previous tree. Let’s look at node “C.” This is a node in location three of the array. The parent of this node should be at location 3/2 which gives us 1 (we ignore the fractional part, as just mentioned). The node at location 1 is “A.” “A” is indeed the parent of node “C.” Now, again “C” was at location three, so its children should be at locations six and seven. The nodes at locations six and seven are “E” and “F,” which are indeed the children of node “C.” Now, take a moment to try the same thing on node “F.” Did everything work out correct? Great. Let’s move on.

Now, what is bad about this representation? Basically, if the tree is not dense enough, considerable memory gets wasted. Let’s look at an extreme example. The eight node tree shown below requires an array of size 127. This is a serious memory. You can imagine that it gets even worse for bigger trees!

Fig 5. This is bad… really bad! Lots of memory gets wasted

The bottom line is, this representation is really good for very dense trees, but it gets really bad when the trees are not very dense. An example of when this representation is perfect, is when we have a heap. Heaps are special kinds of binary trees that are very dense (In fact, no memory gets wasted at all!). But, let’s leave them for another time.

So what do we do when the tree is not dense enough? This calls for a different representation, namely, the linked representation.

Basically this representation shows the tree nearly as we see it on paper. We define a new class “Node” to hold the data of a single node, and we put two references to the two children of this node. When a node has no left child, or right child, we can set its reference to null.
So, if we define a class like this…

class Node
{
int data; //or whatever
Node left;
Node right;
}

The tree is now a collection of nodes linked together. For example, this tree…

Fig 6a. Now, this tree…

Will be represented as 7 instances of Node, like this…

Fig 6b. could be represented using linked representation like this

Each node is represented by an instance of Node. Node has 3 members. Left (the small rectangle to the left), Data (the rectangle in the middle), and Right (the small rectangle to the right).

So, what is good about this representation? It doesn’t waste much memory even when the tree is not dense enough. But there are drawbacks. There is some extra storage wasted, of course, because of all the extra references. Another problem is that, given a node, it is not easy to find its parent. (Some tweaks can be done to the representation to allow this, but that’s another story).

Now, we know what a tree is, and we know how to represent it. It’s time to see trees in action. Let’s head directly to our first example, the dictionary.

{mospagebreak title=Dictionary example}

Let’s assume you are trying to write a program for a dictionary. The program reads a word from the user, and tells him the meaning of that word. But, here is a little twist. In our dictionary, we want to allow the user to add his own definitions at runtime. At any moment, the user can ask the program to add a word; he enters the word, and the definition, and it becomes part of the dictionary. We would also love to give the user the option to show all the words that start with a certain prefix. For example, the user can enter “cat,” and the program shows him all the words that start in the letters “cat.”

Now, how can we implement this without causing a cat-astrophe? Where will the words be stored? If you considered a sorted array, that’s a good idea, because you can then use binary search to find words very fast. However, there is a small problem. How would you add new words? This will not be an easy task. You will have to find where the word should go, and you will have to move all the subsequent elements down one location, which can become a real problem if you are dealing with millions of words.

Now, let’s assume that you decide to represent the dictionary as a sorted linked list. Insertion is much easier, because you can simply create a new node, and insert it at the correct location. However, searching becomes a disaster, because you can no longer use binary search to search a linked list. You will have to search the entire linked list one by one, which again can be catastrophic if you have millions of words.

So, what is the solution to this dilemma? Neither a sorted array nor a linked list works well. Each one has an advantage and a disadvantage. Wouldn’t it be great if we could put the dictionary in some sort of a data structure where both insertion and searching can be done fast enough? As surprising as it may seem, using trees can solve our problem in a simple and elegant way. But how? All we have is a list of words! It doesn’t even fit in a tree. Well…actually, it does. Let me explain how this is going to work.

First, we will construct the dictionary as a binary tree in the following manner. Any node will contain two strings, the word and its meaning. We will arrange the nodes such that the root will contain a word that comes alphabetically before all the words in the right subtree, and at the same time, it come alphabetically after all the words in the left subtree.

It doesn’t stop here. The rule continues recursively over every node in the tree. We will construct the tree in such a way that if you look at any node, you will find that the word in it comes alphabetically before all the words of the nodes to its right, and it comes alphabetically after all the nodes to its left. For example, look at this example…

Fig 7. A valid dictionary. If only real life dictionaries were like this!

This tree fulfills all the conditions we have just outlined. Have a look at each node. Look at all the nodes to its left (children, grand children and so on). They all come alphabetically before it. Look at those to its right. They all come after it.

But the following tree…

Fig 8. An invalid dictionary

…is not good! Why? The problem is with the word “castle.” It comes alphabetically before “dean”, but “castle” is to its right. It should have been on the other side of “dean.” This one doesn’t work.

So far so good? Great. We have just re-invented a very special kind of tree called “Binary Search Trees.”

{mospagebreak title=Putting it Together}

So, what is good about this weird representation we have? First, given a word to find, it is easy to search for it. How? Just start at the root. See if it is the word you need. If it isn’t, decide whether to go left or right. This is easy, because we know all the elements to the left come alphabetically before the element at the root, and all those to the right come alphabetically after the one at the root. So, with a single comparison, we can decide whether to go left or right.

Even better, because the same condition applies recursively, we can repeat the same at each step, till we either find the element we are looking for, or we hit a null pointer, where we know the element does not exist. This is as good as binary search (assuming the tree is balanced), because with every comparison, you reject half the remaining elements (by deciding to go left, you reject all the elements to the right, or vice versa), just as you would in a binary search.

So searching is easy. What about insertion? Insertion is pretty easy too. In fact, it is similar to searching. To insert a new word, we take the same steps we used to search for it (start at the root, decide to go left or right by comparing the word you are trying to insert with the word at the root, and repeat) When we find a null pointer, we insert a new node with the new word at this place.

So, insertion and searching are both easy. This is not something to take lightly.  As we saw earlier, neither linked lists nor arrays could offer the same kind of behavior. By the way, deletion is easy too, as is range searching (listing all words between 2 words). But both are outside the scope of this article.

So, let’s try to pull everything together. We start by writing the source code necessary to implement searching and insertion. Let’s start with searching, since it is the easier of the two.

class wordMeaningPair
{
String word;
String meaning;
wordMeaningPair left;
wordMeaningPair right;
}

class Dictionary
{
wordMeaningPair dict;

String lookup(String word)
{//given a word, return the meaning of it. Or null if not found
wordMeaningPair srchNode=dict; //start at top
while(srchNode!=null)
{
if(srchNode.word.compareTo(word)==0)
return srchNode.meaning;
if(srchNode.word.compareTo(word)<0)
srchNode=srchNode.right;
else
srchNode=srchNode.left;
}
return null;
}
}

All we do is look at the node, and decide whether to go left or right based on the comparison. If we find the word we are looking for, we return it immediately (this is the return statement inside the loop). If the while loop runs till the end (a null pointer is found), we know the word is not there, and we return null.

Insertion is only slightly trickier. Let’s have a look at the code. This method should be a member of the Dictionary class above.

void insert(String word, String meaning)
{
wordMeaningPair newWord=new wordMeaningPair();
newWord.word=new String(word);
newWord.meaning=new String(meaning);
newWord.left=newWord.right=null;
wordMeaningPair srchNode=dict;
wordMeaningPair prev=null;
if(dict==null)
{
dict=newWord;
return;
}

while(srchNode!=null)
{
prev=srchNode;
if(srchNode.word.compareTo(word)>0)
srchNode=srchNode.right;
else
srchNode=srchNode.left;
}
if(word.compareTo(prev.word)>0)
prev.right=newWord;
else
prev.left=newWord;
}

What does this code do? Actually, it’s very similar to the searching code. First, the code creates a new node, then it inserts the word and its meaning into the node. Up till that point, the node is totally isolated from the tree! It is not connected to anything. So the next thing we need to do is get it into place.

First, if the whole tree is empty (dictionary is null), we simply create a new tree containing the one newly created node, and return immediately. (This means that we let dict point to newWord, and return).

Next, we use the same searching mechanism. We start at the root, and compare the word we are trying to insert to the word at the root, and decide whether to move left or right. We keep doing this until we hit a null pointer.

The only new part of the code is the introduction of the prev reference. What is that? This is simple. If we keep following the nodes of the tree, as we did in the searching example, we will end up with a null pointer, and nothing to do! Which is pretty funny, but totally useless.

Instead, what we actually want to happen is that, when we meet a null pointer, the node found immediately before this node is made to point to the newly created node.

Take a look at the figure below. It will make things clearer.

Fig 9. Nodes visited before inserting a new node

When we try to insert a new element with a key of six, the code will pass through the three highlighted nodes (exactly as the searching code did). It will end up at the right link of the node “5.” At this time, root is null, but prev points to the node “5,” so we change the “right” reference of “5″ to make it point to the newly created node “6.”

Now, trees are great. But, is life really that good? Unfortunately, the answer is no. Sometimes trees cause all sorts of problems. Searching and insertion to a tree is fast and efficient, but only when the tree is nearly balanced. If the tree is not balanced, insertion and searching become unbelievably inefficient.

Unfortunately, the balancing of a tree depends on the order of inserting elements. Let’s imagine you have 7 elements 1,2,3,4,…,7. If you inserted them in that exact order, you will end up with a tree that looks like this…

Fig 10. Trees can behave really badly if you aren’t careful

Ewww.. that was pretty bad, wasn’t it? Now, this looks similar to a linked list, which is not a very good thing.

Is there a way we can prevent trees from skewing this way? Luckily, there is! But explaining this will get complex. I will give an overview near the end. Meanwhile, let’s turn our attention to a new problem. In this case, the data is actually best represented as a tree. This is the situation when you’re dealing with an Internet browser.

{mospagebreak title=Internet Browser}

Consider this scenario. You are fed up with Internet Explorer. You don’t love Opera, and you are not even very happy with Mozilla (you must be really hard to please, eh?). So, you decided to take the brave step of building your own Web browser…and you immediately faced a problem with frames.

The user clicks on the window of your browser. Your browser needs to know which frame this click fell into, so that it could be delegated to the proper frame. This is the problem we will be handling now.

To repeat the problem a bit more formally: given a location on the window (X,Y), we need to know to which frame this click belongs. Pretty simple, right? First, let me give you a quick overview on frames, in case you haven’t met them before.

Frames are used to display multiple views in an HTML document. You can split the document horizontally or vertically, and in each pane you can display a different HTML document. Look at the page below for an example.

Fig 11. An example of a Web page with three frames

Frames can be nested, which results in pretty complex shapes. We can end up with something that looks like this (or even more complex):

Fig 12. More complex frames

Now, again, our problem is, given an HTML document and a mouse location, we need to know to which frame any particular click belongs.

Before solving this problem, we have to stop and ask ourselves an important question. Specifically, how are we going to represent the frames in our program? What kind of data will be needed to hold the information about the frames? Believe it or not, it looks like trees are going to be an ideal choice for this task. Every node in the tree will represent one division (horizontal or vertical). We can imagine that every node in the tree splits one frame into two frames.

If it is still notl very clear, let’s support it with some examples. For example, a frameset like this:

Fig 13a. This easy one…

can be represented by a single node like this:

Fig 13b. …will end up as a single node like this

The letter describes the type of division (horizontal, vertical). The ratio is the ratio of total length at which the division occurs.

Of course, the horizontal example below:

Fig 14a. The horizontal case…

will also be represented by a single node like this:

Fig 14b. …is no different

Note that every node holds the type of division (horizontal or vertical), as well as the ratio at which the division occurs. The left child points to the frame that comes first (if it is a horizontal frame, the left child points to the frame on top, if it is a vertical frame, the left child points to the frame to the left). To avoid marrying ourselves to a certain resolution of window size, we are using normalized coordinates. This means that the divisions are all ratios of the whole width. For example “H”, 0.3. means a horizontal division at 0.3 of the height.

Here’s a more complex example:

Fig 15a. Web designers can sometimes be overly creative!

But it can easily be represented like this…

Fig 15b. The above example represented using a tree

The letter names at the end of the tree are just there for the purpose of illustration. I hope they help clarify the idea behind using the tree. The very first node (“H”, 0.2) represents the very first horizontal frame. The window is split into ”A”, “B” at a side, and the rest of the frames are on the other side.

Now, the beauty of binary trees become apparent. It doesn’t matter how many frames, or how complex they are, they can always be represented using just a single binary tree. If a page has only two frames, it is the same as having 20 levels of nested frames. It can always be represented as a single binary tree, without any additional complexities.

Now, a whole lot of problems arise. How can we parse the frameset tags to construct the tree? How can we delegate the clicks to the appropriate frames? What happens when the user resizes the window?
For the purpose of this article, we are going to pick the problem that illustrates the work of trees best. Namely, given a click, determine which frame the click fell into.

So, how do we solve this problem? This will sound very similar to our dictionary example. We start at the root, and see which side to move to, depending on the click location. With every new level we do the same thing until we reach a node that has no children (leaf node), and we can determine which frame to go to.

Let’s take a look at the frames in our previous example, figure 15 above. Now, assume the user clicked at point (0.8, 0.7). Remember, to avoid problems with screen size, we assume all numbers are ratios of window width and height. So, (0.8,0.7) means the user clicked at a point that is 0.8 from the window width, and 0.7 from the window height.

First, starting at the root, we find that the point 0.8, 0.7 must lie near the bottom. This is because the first frame dictates that the screen is split horizontally at 0.4. Which means the y coordinate of the point 0.7 must be at the bottom.

Next, at this point, we move to the node  “V”, 0.3 (the right child). We know the point must be at either frame C, D or E (we eliminated “A” and “B” after 1 comparison).

Third, we compare at the new node, and because this is a vertical split, we compare the x of the point we have (which is 0.8), to the point of the split (which is 0.3). Now we know we have to move to the right, because 0.8 is larger than 0.3. We know we are in a frame that is to the right.

So, we move to the third node “H”, 0.8. Because this is a horizontal split, we compare the y value of the point (0.7) to the y value of the split (0.8), and we know we have to move up (take the left child), and we end up at the D frame, which is correct.

I hope this makes sense. Try to trace for yourself what happens if the user clicks at point (0.9, 0.1). Can you get it? Did you end up at frame “B”? Great, let’s move on…

Now comes the good part, how do we write the code for this? Believe it or not, it is extremely easy. Take a look.

class frameNode
{
boolean isHorizontal;
float ratio;
frameNode Left;
frameNode Right;

}

class FrameSet
{
frameNode FrameTree;
void findFrame(float x,float y)
{
frameNode frame=FrameTree;
frameNode prev=null;
while(frame!=null)
{
prev=frame;
if(frame.isHorizontal)
{
if(frame.ratio<y)
frame= frame.Left;
else
frame = frame.Right;
}
else
{
if(frame.ratio<x)
frame = frame.Left;
else
frame = frame.Right;
}
}
//here, prev is the node splitting the last frame to 2.
//One of them is the frame we want.
}
}

The code simply checks every node to see if it is horizontal or vertical, and decides whether to check the x or y cords of the click point. Note, by the way, that we didn’t actually use the information we obtained. We just found the frame we wanted, and then did nothing. This is, of course, because it is just an example. If this was a real life application, some modifications would have to be made. For instance, we could put in an extra level of nodes to contain real information about the frame.

And that was it, our second example with trees. I hope you didn’t find it too complex.

You have now seen two applications for trees. I sincerely hope I have succeeded in arousing your interest in using trees.

{mospagebreak title=Conclusion}

Though trees seem complex at first, they are fairly powerful data structures. We have looked at two representations for trees, the array, and the linked structure. We have also covered some examples of how to use trees. The concept of a binary search tree was introduced as an alternative to storing data in an array or a linked list. Overall, I hope I could get you interested in this amazing data structure.

So, what’s next? If you are still looking for more, you are in luck, because there are plenty of things you can do with trees, and plenty of material for you to read about the subject. Do you remember our first example, where we showed that, if the tree gets skewed, everything gets screwed? There are many solutions to this problem; the nicest is a special kind of binary search tree called an “AVL tree.” This kind of tree rotates if it gets imbalanced, to force the tree to balance itself!

There are other kinds of trees, like BSP trees, which are similar to the trees we used for our second example. They are more complex, however, and used in 3D programs. For example, they were used in the games Doom and Quake.

[1] We didn’t mention how we can delete elements from a binary search tree. Point your browser here for more details: http://www.cs.oberlin.edu/classes/dragn/labs/avl/avl4.html. This site also explains how to do range searching in a binary search tree.

[2] As promised, here is a site about AVL trees: http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/AVL.html

[3] If you are curious about how any general tree can be represented as a binary tree, look over here:  http://www.nist.gov/dads/HTML/binaryTreeRepofTree.html