Solving Problems with Recursion

Recursion is a way to solve a problem by…reducing it to the same problem. What? It may be counterintuitive, but many turn-based games (including chess) use exactly this technique to make a computer player “think.” Mohamed Saad explains the concept, along with when (and when not) to use recursion in your programming. Check out the Connect4 example!

Recursion: Solving problems the recursive way

In this article, I will introduce the concept of recursion, one of the most amazing programming constructs, as you will see. Recursion is quite counter-intuitive, and you will certainly find it weird at first, but once grasped, it becomes a seriously powerful weapon in your programming arsenal. Let’s start with the definition.

What is Recursion?

Recursion is a totally different method of solving problems. Conventional problem solving methods consists of decomposing the solution into steps, and executing each step in turn. Well…enter recursion. The recursive way of solving a problem is to reduce the problem into another problem that is exactly of the same type, and solving the new one instead. And how do you solve the new one? Guess what? You just reduce it into another problem of exactly the same type, and so on, and so on, and so on…

Ok, before it gets too weird, let’s see an example of a real-life problem, and how we can use recursion to solve it.

Real Life Example

Let’s assume you have moved to a new apartment. You are exploring the neighborhood, and you suddenly discover that you don’t know your house number. You looked at the building, didn’t find a number there. You knocked on a random apartment, and asked, “what is the number of this building,” and the guy who opened the door smiled and said, “I don’t know the number of this building, but if you look at the number of the building next to us and add 1, you will know our building number.”

Now, let’s pause for a moment. The problem you are trying to solve is “knowing the number of your house.” Let’s look at the proposed solution to this problem. What was the suggestion? You were asked to look at the number of the building next to you and add 1. Your problem was thus reduced into a problem of exactly the same nature! At first your problem was “to find the number of a house,” and your problem now became “finding the number of a house.”

Solving Problems with Recursion

Fig 1. You never knew knowing your house number could get you into so much trouble, did you?

{mospagebreak title=The Plot Thickens}

To confuse things even more, let’s say you visited the house next to you, but you still found no number. You knocked on the door, and asked about the house number. Guess what they said to you? “We don’t know, but look at the house next to us, add 1, and you will know our house number.”

Déjà vu anyone? This game could take a while. You should repeat this again and again, till you find a house with a number you know. You add 1, and know the number of the one next to it, and so on, until you reach your house.

Notice that in every step, your problem was always reduced to exactly the same kind of problem (knowing a house number). This is the heart of recursion. We have just made our first recursive solution to a problem.

Before we go any further, I want to make two very important notes. These notes are going to be extremely useful when we start dealing with recursion in programming.

Note 1: In every step of solving a recursive problem, the problem always reduces to a problem of exactly the same nature.

This one is obvious, but now look at number 2.

Note 2: At a certain point, you should just solve the problem you have immediately rather than reducing it any further.

This note is extremely important. Let’s return back to our house numbers problem. If you just keep asking and you are always asked to see the next house, you will spend your entire lifetime looking at buildings! At a certain point, you have to find a house with a number that you know. At this point, the recursion stops, and you start to solve all the problems you left open.

Let me stress this again: you can’t keep reducing your problem into a similar problem forever, or else you are never going to stop. At a certain point, you have to just stop and solve the problem at hand. This point is sometimes called the stopping condition.

Now, I can already hear you screaming, “What does this have to do with programming?!” Well, a lot, actually. When you are writing a program, you are solving a problem (or a set of problems). If you write your solution to the problem in a recursive way, what will it look like? Basically the function you write to solve the problem is going to eventually call itself. What for? This comes from our definition of recursion. We reduce a problem to a problem of exactly the same nature. This is why the function calls itself to solve the new instance of the problem…

Sounds complicated? Don’t worry. Let’s look at our first programming example, a really simple one.

{mospagebreak title=First Programming Example}

As promised, this one is going to be easy. We will take a well known problem, and see how we can solve it recursively. The problem is searching an array. You are given an array, and you want to search for a certain value inside the array. The non-recursive approach to this problem would be to iterate through the array looking for the value we search for. When it is found, we stop and report success, and if not, we report failure.

Now, let’s think recursively. We want to reduce the problem of searching an array into a problem of…well…searching an array. After all, this is how recursion works.

Think of this approach. To search an array whose length is n for the value of v, do the following.

If the element is in the very last position, report as success

If not, search the whole array (except for the last position), for the element we are looking for.

You see, we actually managed to reduce the problem from searching an array into searching an array. Are we forgetting anything? Of course we are: the stopping condition! When should this stop? Luckily this is simple–when you are faced with an empty array, report failure.

Now look at the code to actually implement this…

class List
{
 int[] data=new int[100];
 public boolean search(int v, int n)
 {//v is the value we search for, n is length of array to search
  if(n==0) return false;
  if(data[n-1]==v) return true;
  return search(v,n-1);
 }
}

Take a closer look at this code. I have removed everything, except for the relevant parts. Did you notice that there aren’t any loops at all? That’s right, we made a search over all the elements of the array without having any kind of loops in the code. Just a couple of if statements, and a return statement! How does it work?

It is a little tricky. First you call search(v,100). This function checks the last element of the array, and if that’s the one we want, it returns true. If not, it calls search(v,99). The newly called function checks the position before the last one, and if the element is not found there it calls search(v,98), and so on. If the element is not in the array, the function search() will keep calling itself till it reaches search(v,0), which will return false. If the element is found, the function that finds it will return true.

If you think this is complicated, don’t worry. Recursion looks complicated the first time you see it. I mentioned that recursion is quite counter-intuitive. The human brain can never solve problems recursively. Our brains can only solve problems by thinking of the steps necessary, and doing them one by one. Only computers are comfortable with recursion. Anyway, let’s get back to our discussion.

I can hear the skeptics complaining, “I can’t see any benefits from using recursion. We solved a simple problem by writing a really tricky piece of code. We could have just lived with a small loop to do the search, and ignore recursion completely.” Patience people; I can’t say everything at the same time! Let’s head into our second example.

{mospagebreak title=The Flood Fill Algorithm}

Now, let’s turn our attention to a totally different problem. We are working on a 2D drawing package, and we want to make a bucket fill tool similar to that found in Microsoft Paint. The user selects a color, clicks on a point, and the color spreads to fill the area with this color. Have a look at this figure.

Solving Problems with Recursion

Fig 2. We need to write a program to flood fill a shape

We want to write a function to implement this feature. We are given a 2 dimensional array of colors, a point to start painting from, and a color. We should fill the fill area with the color as mentioned above.

There is no easy non-recursive solution to this problem, but we can solve this recursively in just 8 lines of code. How? The idea is simply to make a flood fill from a point do the following:

  • If the point we want to color is already filled, don’t do anything and stop, else color it.
  • If the point above it is not colored, call flood fill from that point (recursive step).
  • If the point to the right is not colored, call flood fill from that point (recursive step).
  • If the point to the left is not colored, call flood fill from that point (recursive step).
  • If the point below it is not colored, call flood fill from that point (recursive step).

Now, you see, the whole solution is just 5 steps, and 4 of them are recursive calls. It is important to understand how this actually works.

class Canvas
{
 byte board[][]=new byte[100][100];
 public void Fill(int x,int y,byte c)
 {
  if(board[x][y]==c) return;
  board[x][y]=c;
  if(x<99&&board[x+1][y]!=c)
   Fill(x+1,y,c);
  if(x>0&&board[x-1][y]!=c)
   Fill(x-1,y,c);
  if(y<99&&board[x][y+1]!=c)
   Fill(x,y+1,c);
  if(y>0&&board[x][y-1]!=c)
   Fill(x,y-1,c);
 }
}

I have tried to make the code simpler by fixing width and height. But, of course, it is easy to generalize the code to different sizes and color formats.

{mospagebreak title=How it Works}

How does this code work? Looks too small to be true? Well, it isn’t. It works like this: you start at a point, if it is already colored you do nothing. This condition prevents color from flowing out.

Next, you turn your attention to the 4 neighboring pixels. If any one of them is not colored, start a flood fill from that point too. And guess what, when you start a flood fill from each of those points, they will start to fill the neighboring pixels, and so on, and so on… This is the magic of recursion.

Take a look at this figure. It shows the first 3 steps of how flood fill works. This figure is not actually 100% accurate in terms of the sequence of pixels being filled, but I hope it illustrates the idea.

Solving Problems with Recursion

Fig 3. Starting at the red point, the program fills the point, and then calls itself to fill the 4 surrounding points, and then from each of the surrounding point, calls itself to fill the 4 surrounding each one of them and so on (Note: the function calls will not occur concurrently as shown here. This is just for illustration purpose. In reality. They will be called one after the other. The sequence of coloring pixels will be different)

Now you see how it works. We have created a really nice flood fill algorithm that can fill really complicated shapes with a minimum amount of effort.

I’ll let you in on a little secret. The code we have just written is bad in its memory usage. Really bad. Why? The same function is called again for every pixel. Each function call takes a specified amount of memory that is released when the function is finished. And so, by the time we reach the last pixel to fill, there are lots of instances of the flood fill function all in the memory, taking lots of space. For a large shape, this memory could be in the megabytes range. Not something to be ignored if you are working on a machine with limited memory (such as a mobile phone). 

The moral is to be careful when dealing with recursion. There is a secret overhead because of the function calls, which should never be ignored. 

Coming up next is a major example, and one of the best uses of  recursion. It is harder than the previous examples, but it’s such a good example that I feel obligated to include it. Grab a cup of coffee; it should help. I’d treat you if I could…but let’s not digress. Instead, let’s go directly to our final example. 

{mospagebreak title=Connect 4 AI program}

Our last example is creating an AI routine for a connect 4 program. Let’s assume you are creating a connect 4 game, and you want to write the routine for the computer player. How do you make the computer think? This is our example.

By the way, the same idea can be used to make an AI routine for nearly any turn based game, be it Tic-Tac-Toe, Connect 4, or even Chess. We will see in a moment what changes need to be made, and why the harder games require more work.

First, for those people who don’t know what Connect 4 is, it involves dropping pieces into a board to form a line of of 4 in a row, column or a diagonal. You can find an applet that plays Connect 4 at http://www.bodo.com/Applets/Connect4. I hope this gives you an idea of the game.

What is a good way to make the computer play this game? Here’s an idea. Computers are good at enumerating choices, so let’s make them do just that. Here is how we will make it work.

First, let’s consider this solution. The computer will try to put one of its pieces in each different location. Whatever makes the computer win, it will take it. What do you think of this solution? It is not very good. Why? Because the computer is not planning anything ahead. If the computer is only looking one move ahead, it is just like a man fighting in the dark, a man who can only see his footsteps. He will never be able to beat his opponent. Especially if his opponent can see and plan his moves very carefully.

Here is a better alternative. The computer will try to play a piece in each possible location. For each location, it will also try to put a piece for the opponent in each possible location, and then evaluate the position. It will choose the best position. Now, this is a bit better. The computer is now looking one move ahead.

But why stop here? We can extend this idea like this. The computer will try to place a piece in each possible location. For each location, it will try to put a piece for the opponent, and for each location it will try to place a piece for itself, and so on.

In other words the computer is trying all possible combinations for several moves ahead, and then it chooses the best move.

How should the computer decide the best move? It should assume that both sides are playing their best moves. So, when it tries its own moves, it should take the best move for itself, and when it considers the opponent’s move, it should take the worst move for itself. In other words, the best move for the opponent.

Getting complicated? Don’t worry, let me explain it in more detail. When I am trying my moves, if I find a really good move, then I certainly will take it. When I am enumerating my opponent’s moves, what if I find a really good move for him? He will certainly take it. It is that simple. When you are considering moves, you should take the best move for the side you are trying moves for.

{mospagebreak title=The Minimax Method}

What we have just created is called the minimax method. Let’s see roughly how our code should look. I will try to keep things as simple as possible, so as not to distract you from the main idea. I will explain how to improve on it later. 

class MoveValue
{//holds a score, column pair.
 int Score;
 int col;
 MoveValue(int s,int c)
 {
  Score=s;
  col=c;
 }
 MoveValue()
 {
  Score=col=0;
 }
}

class Connect4
{
 byte board[][]=new byte[8][7];
 //the connect 4 board. 0 means empty, 1 means player 1 piece
 //2 means player 2 piece

 MoveValue bestMove(byte sideToMove,int depth,int maxDepth)
 {//Finsd the best move for the side. Returns column and score pair
  //depth specifies current level of depth (should be passed as 0)
  //maxDepth specifies the maximum depth to search
 
   //holds the best score
   MoveValue best_score = new MoveValue(-100000,0);
    if(depth%2==1)best_score.Score*=-1;
  
   MoveValue mv=new MoveValue();
   for (int i=0;i<7;i++)
   { //try putting a piece at each of 7 columns.
  mv.col=i;
  if(board[0][i]!=0)
   continue; //if column is full ignore it.
   
  //find the row to put this piece. x is the row
  int x=0;
  while(x<8&&board[x][i]==0)x++;
  board[x-1][i]=sideToMove;

 
  //find if it is a game over, don’t search further if it is
  //a score of 100 or -100 means a game over
  MoveValue cur=new MoveValue();
  cur.Score=value();

      if(cur.Score==100||cur.Score==-100||depth==maxDepth)
       mv.Score=cur.Score;
  else //search further (this one is the recursive call)
    mv.Score=bestMove((byte)(3-sideToMove),depth+1,maxDepth).Score;
  //3-sideToMove, gives you the number of the opponent. If sideToMove is 1
  //3-sideToMove will be 2. If sideToMove is 2, 3-sideToMove is 1

  //if this move is better, use it instead
  if(mv.Score>best_score.Score&&depth%2==0)
   best_score=mv;
  else if(mv.Score<best_score.Score&&depth%2==1)
   best_score=mv;

  //undo the move. Remove the piece you just put
  x=0;
  while(board[x][i]==0)x++;
  board[x][i]=0;
   }
   return best_score;
 }
}

This is the most complex piece of code we have written today. What does it do? Let me explain.

The code starts be defining a new class MoveValue, just to hold the value of a move, plus the column to play at. For example, if we discover that the best move is at column 3, which will give you a benefit of 90, the pair should hold the value (90,3).

Now, let’s turn our attention to the code itself.

The basic idea is simple. Each move is assigned a score based on how good this move is for the player. We choose the best move for the player by choosing the move with the highest score.

The code first declares a MoveValue variable called best_score to hold the best move it can find.

If depth is odd it initializes the score to a very small negative number, or else it initializes it to a big positive number. Huh? Why? Remember when we said the computer, when it tries its moves, should take the best move for itself, and when it considers the opponent’s move, it should take the worst move for itself. In other words, the best move for its opponent.

This is the reason behind those initial values. When depth is even, we are considering a move for the computer, so we initialize the score to a very small value so that the first possible move will be better than it. And when depth is odd, we initialize it to a very big number, so that any possible move will be worse than it.

{mospagebreak title=Placing Pieces}

Now what? We make a loop over all 7 columns and try to put a piece in each place. We try to put a piece in column #i if it has an empty slot.

After trying to put a piece we call a function called value() to tell us about the value of this position. The function value(), which we are going to cover in a moment, is given the current situation, and gives us a score that tells us how well this position is for the computer–higher is better, of course.

We call this function first to check for a game over. If it returns +100 the computer has won, or if it returns -100 the human player has won (this is how we are going to program it). In both cases, we don’t need to continue trying further moves. The game is over.

So, we check if the return from this function was +100 or -100 or if we have reached the end of our analysis (depth is equal to maxDepth) and we just take the value returned from the value() function.

If not, we recursively call the bestMove() function to continue digging deeper into the tree of possibilities.

In both cases, we check the return from the function against the best score found so far. If the new score is better than the best score found so far (i.e. it is larger than the best score if it is the computer’s turn, or smaller than the best score if it is the human’s turn), we replace the best score with the new score.

Finally, we return the board to what it was by removing the piece we just put. We repeat again for the next column and so on…

The Value Function

Now, what about that value function? This function should tell us how good a position is from the point of view of the computer player. For the sake of simplicity, we will make a simple value function that just checks who is winning. Here it is…

int value()
{//get the value of a position

 //check for 4 in a row
 for(int i=0;i<5;i++)
  for(int j=0;j<7;j++)
   if(board[i][j]!=0&&board[i][j]==board[i+1][j]&&
   board[i][j]==board[i+2][j]&&board[i][j]==board[i+3][j])
    if(board[i][j]==1)
     return 100;
    else return -100;
    
 //check for 4 in a column
 for(int i=0;i<8;i++)
  for(int j=0;j<4;j++)
   if(board[i][j]!=0&&board[i][j]==board[i][j+1]&&
   board[i][j]==board[i][j+2]&&board[i][j]==board[i][j+3])
    if(board[i][j]==1)
     return 100;
    else return -100;
    
 //check for 4 in a diagonal
 for(int i=0;i<5;i++)
  for(int j=0;j<4;j++)
   if(board[i][j]!=0&&board[i][j]==board[i+1][j+1]&&
   board[i][j]==board[i+2][j+2]&&board[i][j]==board[i+3][j+3])
    if(board[i][j]==1)
     return 100;
    else return -100;
   
 //check for 4 in the reverse diagonal
 for(int i=3;i<8;i++)
  for(int j=0;j<4;j++)
   if(board[i][j]!=0&&board[i][j]==board[i-1][j+1]&&
   board[i][j]==board[i-2][j+2]&&board[i][j]==board[i-3][j+3])
    if(board[i][j]==1)
     return 100;
    else return -100;
 return 0;
}

As you see, the function just checks if there are 4 pieces in a row, column or diagonal, and if it is the computer’s pieces, it returns +100, and if it’s the human’s pieces it returns -100.

{mospagebreak title=Trying it Out}

Whew, and that’s it. Now we have a nice Connect 4 AI engine that can actually look several moves ahead, thanks to recursion. For example, if we try the engine on this position:

Solving Problems with Recursion

Fig 4. The AI (red player), will know how to win in this situation

If the program is called to play for the red player with a depth of only 3, it will correctly find that the best move is to put a piece in column 2 to force the human to block at column 5 and then the computer wins by putting a piece at column 5.

In another situation: 

Solving Problems with Recursion

Fig 5. If the computer (yellow player) could only see one step ahead, he would be unable to see the threat of playing at column 3 or 6 and winning. Now, it can detect and prevent it.

The computer correctly decides to play at column 3 to block the human from winning by putting at column 3 or 6.

Pretty impressive, huh? And it is all less than 30 lines of code.

{mospagebreak title=Two More Tips}

So, is this the end of story? Certainly not! There is room for improvement, but that’s really beyond the scope of this article. I will just give you 2 pointers. First, the value() function could be greatly enhanced, so that it gives a better scoring system. For example it should give some score for 3 in a row (because that’s close to getting 4 in a row). If further studies are made about the game, we can even create a much better value function that takes into account the position of pieces relative to each other, and to the edges and so on…

On the other hand, we can also improve on the bestMove() function itself, by using the alpha beta pruning technique. This technique makes minimax run a lot faster, and I mean a lot. Sadly, this is also outside the scope of this article, but you will find references to alpha beta pruning at the end. Maybe this should be the topic of a future article. Drop me an email if you are interested.

One last comment. The routine we made ignores the case when the board gets full and the game ends in a draw. I didn’t want to complicate things, but I am sure you can easily add it to the current code without much trouble.

So, that was it. Our last example for recursion. But this not the end of the journey. The journey has just started. Hopefully you are ready to start writing your own recursive code now.

When to Use (or Avoid) Recursion

As we saw with the flood fill program, sometimes recursion wastes a lot of memory space (or execution time). Other times, recursion is just the way to go. How do you decide?

  1. If the problem definition fits well with a recursive solution, this is a good sign to use recursion, but see number 2.

  2. Before writing a recursive solution, think of memory and processing time. If you can find a non-recursive solution the does the job faster, or by using less memory, don’t use recursion (unless you know what you’re doing).

  3. Recursion can be overused, resulting in some very clumsy code. Don’t overuse recursion, or your code will be unreadable.

  4. If the recursive solution is really small and elegant, this is a good sign you should go for recursion.

  5. Tail recursion (a recursive function whose last line is a recursive call) is optimized well by the compiler. It will not waste memory as you would think.

Finally a bit of theory, for theory fans. Every recursive program can be re-written in a non-recursive way. Period. But this simple statement doesn’t indicate how complicated the non-recursive code can be. Sometimes a 3 liner recursive code can become truly monstrous code when you switch into the non-recursive version. It is always your decision.

Your feedback is always welcome. Send me an email (msaad@themagicseal.com), and tell me what you think. I certainly would love to hear from you.

Further Reading

Google+ Comments

Google+ Comments