For many programmers, bits aren’t an issue that receives any serious attention. Whenever someone mentions a bit, they get a vague idea of ones and zeros strung together, deep within the heart of a computer, at a level they don’t have to work with or even think about. Creating a functioning program, or even a complex program, requires absolutely no knowledge of bits, and as a result, learning about bits is often a curiosity. Programmers can manipulate more complex and more abstract data structures, and these structures form the real meat of the program. With Perl and other similar scripting languages, this is especially true.

But bits are definitely worth learning and thinking about. Sometimes, a particular algorithm requires manipulating bits to work efficiently or at all, and other times, working with bits just makes problem solving a heck of a lot easier. Even in a language like Perl, this is still the case. In any case, it’s good to have an understanding of data at a lower, more basic level, and learning about bits will provide just that.

In this article, we’re going to take a look at working with bits and Perl. We’ll look at the bitwise operators that Perl provides, along with some Perl functions that make bit manipulation easier. We’ll also take a look at an application that requires bit manipulation in order to function.

{mospagebreak title=A bit of a need}

In the introduction, I made several claims about how helpful a knowledge of bits is, yet I offered no proof. After all, you may have been programming for quite some time without even giving a passing thought to working at the bit level. Even after reading this article, you may struggle to find applications for this knowledge. But they’re there, I promise.

Let’s start by looking at an algorithm for finding prime numbers, called the Sieve of Eratosthenes. Prime numbers may not be practical for most people, but I find them very interesting, and I think this example will illustrate some very important points. It’s a very simple example demonstrating the need for bits.

So, what’s the algorithm look like? Basically, you start with a list of numbers, from 2 (the lowest prime number) to whatever the maximum number you want to check for primality. Let’s say we want to go from 2 to 15. The list would look like this:

2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15

The algorithm involves crossing out numbers that aren’t prime, but this is done in a systematic way. We start with the first number not crossed out, which is 2. Then, we cross out all of its multiples:

2, 3, ~~4~~, 5, ** 6**, 7,

**, 9,**

**8****, 11,**

**10****, 13,**

**12****, 15**

**14**We’ve made the first cut. Next, we have to find the next number that is not crossed out, which is 3. We cross out all of its multiples, just as we did with 2:

2, 3, ~~4~~, 5, ** 6**, 7,

**,**

**8****,**

**9****, 11,**

**10****, 13,**

**12****,**

**14**

**15**We repeat this for every number less than the square root of the maximum number to be tested, and the remaining numbers are prime. In this case, the square root of 15 is just under 4, so we’re done. All of the remaining numbers are prime:

2, 3, 5, 7, 11, 13

That’s not very difficult at all. So, how do we represent this in Perl? It’s very simple. All we need is a list of numbers on which to operate. A naïve solution is to create an array where each element is a number to be checked. So, we’d start with something like this:

**my**` ``@numbers`` ``=`` ``(``2``,`` ``3``,`` ``4``,`` ``5``,`` ``6``,`` ``7``,`` ``8``,`` ``9``,`` ``10``,`` ``11``,`` ``12``,`` ``13``,`` ``14``,`` ``15``);`

Then, we could remove each non-prime number from the array as we encounter it, or make it zero or something. This works, but for larger arrays, this is extremely slow, and it’s a waste of memory.

Another solution would be to make an array with values of 0 or 1 and use the indexes as the numbers. For example, if $numbers[2] is 1, then the number is prime. This is headed in the right direction, but it’s really not any better than the first solution. Even if all you store in the elements are 0 and 1, Perl is still going to allocate more memory than you actually need. After all, you’re still working with integers, and integers have quite a range, even if you don’t take advantage of that range.

So, how do we implement this algorithm? By using bits, of course! We’ll come back to this example at the end. Try to think of the exact solution as you read through this article.

{mospagebreak title=Representing binary numbers}

In order to work with bits, we need to be able to represent them in Perl. Perl makes representing binary numbers very easy, just as it makes representing hexadecimal numbers, which are perhaps more commonly used, easy. Suppose that we want to represent a single bit, turned on (1). We’d do it like this:

**my**` ``$one`` ``=`` 0b1``;`

Obviously, this represents the number 1 in base-10. We can represent the numbers 2 and 3 similarly:

**my**` ``$two`` ``=`` 0b10``;`

**my**` ``$three`` ``=`` 0b11``;`

This is confirmed by printing them:

**print**` ``"$onen$twon$threen"``;`

123

Notice how Perl treats the numbers as base-10 when printing them out. This can be changed:

**printf**` ``"%bn%bn%bn"``,`` ``$one``,`` ``$two``,`` ``$three``;`

1

10

11

This makes much more sense.

So now you know how to represent a number in binary with Perl. However, that alone doesn’t do anything for us. If all you want to do is represent a number, then you’re better off just writing the number in base 10, which would certainly save a lot of time.

The real value in thinking in binary is that for each number, what you really have is a string of bits, which can be in one of two states (1 or 0, of course). This is useful for multiple reasons. You might have a string of related boolean values that you need to easily represent. In this case, you’re able to pack a lot of values into a very small amount of memory.

Or, you might need to represent some other type of value which the current data types do not suit. For example, you might need to represent something that has four states. This can be done in two bits.

Or, you might need to represent something really big, in which case you need a large number of bits. Using bits, you’re only bound by the amount of available memory. Either way, the solution can be found at the bit level.

So, how can we manipulate bits, and how can we determine the value of individual bits? This is done with bitwise operators. Perl offers the typical bitwise operators present in C-style languages. Let’s take a look at them now.

{mospagebreak title=Operating on bits using bitwise operators}

Let’s start by looking at the bitwise AND operator, represented by the ampersand symbol (&). This operator takes two operands and compares corresponding bits. If both of the bits are 1, then the result is 1. If not, then the result is 0. For example, if we used these two numbers as operands:

1000

1001

The result would be this:

1000

Only the first bit in each number is 1, so in the result, only the first bit will be 1. This is how the operation would look in Perl:

**printf**` ``"%bn"``,`` 0b1000 ``&`` 0b1001``;`

The above line simply prints out the result.

The bitwise AND operator is very useful for picking out the values of certain bits. You can apply a *bit mask* to a number to determine the value of a particular bit. For example, consider this binary number:

1011

Say that we need to determine whether the first bit is 1 or 0. We can do this by using the bitwise AND operator and another binary number, in which only the first bit (the bit we’re looking at) is set to 1:

1000

The operation yields the following result:

1000

If the first bit had not been 1, then the result would have been zero. So, in order to check if the first bit is 1, we simply need to check if the result is nonzero. The following code snippet demonstrates this:

`# A nybble (or nibble) is four bits`

`# (Half of a byte)`

**my**` ``$nybble`` ``=`` 0b1011``;`

**if**` ``(``$nybble`` ``&`` 0b1000``)`` ``{`

` `**print**` ``"The first bit is 1.n"``;`

`}`` `**else**` ``{`

` `**print**` ``"The first bit is 0.n"``;`

`}`

Next is the bitwise OR operator, represented by the pipe symbol (|). You can probably guess the functionality of this operator. It takes two operands and examines corresponding bits, just as with the bitwise AND operator. If one *or both* of the bits is 1, then the corresponding bit in the result will also be 1.

So, if we took these two operands:

1010

0110

And applied the bitwise OR operator, then we would get this result:

1110

This is confirmed by the following line of Perl code, which performs the above operation and then prints out the result:

**printf**` ``"%bn"``,`` 0b1010 ``|`` 0b0110``;`

The bitwise OR operator is useful when working with flags. A flag is simply a bit or a set of multiple bits that indicate some sort of behavior. For example, say you’re developing a library that manipulates text, and you need to give developers the option to specify whether text will be italicized, bold, or both.

You could create a function that takes an argument for each of these properties. Or, you could use bit flags to condense it into a single argument. This would involve using one bit to determine whether text should be italicized, and another bit to determine whether text should be bold. This only requires two bits. We’ll assume that the first bit is for italics, and the second is for bold text. So, to italicize text, this binary number would be passed into the function:

10

To make text bold, this binary number would be passed into the function:

01

Finally, to make text bold and italicized, this binary number would be passed in:

11

In order to make this work, you’d want to create two constants, one with the bit set for italics, and one with the bit set for bold text:

**use**` constant ``{`

`ITALICS => 0b10,`

`BOLD => 0b01,`

`}``;`

In order to call a formatting function, specifying that we want italicized text, we’d do this:

**doSomething**`(``ITALICS``);`

In order to specify bold text, we’d call the function like this:

**doSomething**`(``BOLD``);`

Finally, in order to specify that we want both formatting options, we’d use the bitwise OR operator like this:

**doSomething**`(``ITALICS ``|`` BOLD``);`

We can check the values using the bitwise AND operator. Let’s make the doSomething subroutine print out messages indicating what formatting options have been selected:

**sub**` doSomething ``{`

` `**my**` ``$formatting`` ``=`` `**shift**`;`

` `**print**` ``"Italicized.n"`` `**if**` ``(``$formatting`` ``&`` ITALICS``);`

` `**print**` ``"Bold.n"`` `**if**` ``(``$formatting`` ``&`` BOLD``);`

`}`

Neat, huh? You’ll see this done often in the programming world.

The bitwise AND and OR operators are very important in working with bits in any language, but there a few more very important operators available. In the next article, we’ll take a look at them.