Basic Data Types and Calculations - Try It Out: Using Integer Variables
Here’s a program that figures out how your apples can be divided equally among a group of children:
// Program 2.2 - Working with integer variables
#include <iostream> // For output to the screen
using std::cout;
using std::endl;
int main() {
  int apples = 10;              // Definition for the variable apples
  int children = 3;             // Definition for the variable children

  // Calculate fruit per child
  cout << endl                  // Start on a new line
       << "Each child gets "    // Output some text
       << apples/children       // Output number of apples per child
       << " fruit.";            // Output some more text

  // Calculate number left over
  cout << endl                  // Start on a new line
       << "We have "            // Output some text
       << apples % children     // Output apples left over
       << " left over.";        // Output some more text

  cout << endl;
  return 0;                     // End the program
}
I’ve been very liberal with the comments here, just to make it clear what’s going on in each statement. You wouldn’t normally put such self-evident information in the comments. This program produces the following output:
Each child gets 3 fruit.
We have 1 left over.
HOW IT WORKS
This example is unlikely to overtax your brain cells. The first two statements in main() define the variables apples and children:
int apples = 10; // Definition for the variable apples
int children = 3; // Definition for the variable children
The variable apples is initialized to 10, and children is initialized to 3. Had you wanted, you could have defined both variables in a single statement, for example:
int apples = 10, children = 3;
This statement declares both apples and children to be of type int and initializes them as before. A comma is used to separate the variables that you’re declaring, and the whole thing ends with a semicolon. Of course, it isn’t so easy to add explanatory comments here as there’s less space, but you could split the statement over two lines:
int apples = 10,  // Definition for the variable apples
    children = 3; // Definition for the variable children
A comma still separates the two variables, and now you have space for the comments at the end of each line. You can declare as many variables as you want in a single statement, and you can spread the statement over as many lines as you see fit. However, it's considered good style to stick to one declaration per statement.
The next statement calculates how many apples each child gets when the apples are divided up and outputs the result:
cout << endl // Start on a new line
<< "Each child gets " // Output some text
<< apples/children // Output number of apples per child
<< " fruit."; // Output some more text
Notice that the four lines here make up a single statement, and that you put comments on each line that are therefore effectively in the middle of the statement. The arithmetic expression uses the division operator to obtain the number of apples that each child gets. This expression just involves the two variables that you’ve defined, but in general you can mix variables and literals in an expression in any way that you want.
The next statement calculates and outputs the number of apples that are left over:
cout << endl // Start on a new line
<< "We have " // Output some text
<< apples % children // Output apples left over
<< " left over."; // Output some more text
Here, you use the modulus operator to calculate the remainder, and the result is output between the text strings in a single output statement. If you wanted, you could have generated all of the output with a single statement. Alternatively, you could equally well have output each string and data value in a separate statement.
In this example, you used the int type for your variables, but there are other kinds of integer variables.
Integer Variable Types
The type of an integer variable will determine how much memory is allocated for it and, consequently, the range of values that you can store in it. Table 2-3 describes the four basic types of integer variables.
Table 2-3. Basic Integer Variable Types
Type Name    Typical Memory per Variable
char         1 byte
short int    2 bytes
int          4 bytes
long int     8 bytes
Apart from type char, which is always 1 byte, there are no standard amounts of memory for storing integer variables of the other three types in Table 2-3. The only thing required by the C++ standard is that each type in the sequence must occupy at least as much memory as its predecessor. I’ve shown the memory for the types on my system, and this is a common arrangement. The type short int is usually written in its abbreviated form, short, and the type long int is usually written simply as long. These abbreviations correspond to the original C type names, so they’re universally accepted by C++ compilers. At first sight, char might seem an odd name for an integer type, but because its primary use is to store an integer code that represents a character, it does make sense.
You’ve already seen how to declare a variable of type int, and you declare variables of type short int and type long int in exactly the same way. For example, you could define and initialize a variable called bean_count, of type short int, with the following statement:
short int bean_count = 5;
As I said, you could also write this using the abbreviated type name:
short bean_count = 5;
Similarly, you can declare a variable of type long int with this statement:
long int earth_diameter = 12756000L; // Diameter in meters
Notice that I appended an L to the initializing value, which indicates that it’s an integer literal of type long int. If you don’t put the L here, it won’t cause a problem. The compiler will automatically arrange for the value to be converted from type int to type long int. However, it’s good programming practice to make the types of your initializing values consistent with the types of your variables.
Signed and Unsigned Integer Types
Variables of type short int, type int, and type long int can store negative and positive values, so they’re implicitly signed integer types. If you want to be explicit about it, you can also write these types as signed short int, signed int, and signed long int, respectively. However, they’re most commonly written without using the signed keyword.
You may see just the keyword signed written by itself as a type, which means signed int. However, you don’t see this very often, probably because int is fewer characters to type! Occasionally you’ll see the type unsigned int written simply as unsigned. Both of these abbreviations originate in C. My personal preference is to always specify the underlying type when using the keywords signed or unsigned, as then there’s no question about what is meant.
An unsigned integer variable can only store non-negative values (zero and positive values), and you won’t be surprised to learn that the type names for three such types are unsigned short int, unsigned int, and unsigned long int. These types are useful when you know you’re only going to be dealing with non-negative values, but they’re more frequently used to store values that are viewed as bit patterns rather than numbers. You’ll see more about this in Chapter 3, when you look at the bitwise operators that you use to manipulate individual bits in a variable.
You need a way of differentiating unsigned integer literals from signed integer literals, if only because 65535 can be stored in 16 bits as an unsigned value, but as a signed value you have to go to 32 bits. Unsigned integer literals are identified by the letter U or u following the digits of the value. This applies to decimal, hexadecimal, and octal integer literals. If you want to specify a literal to be type unsigned long int, you use both the U or u and the L.

Figure 2-3. Signed and unsigned integers
Figure 2-3 illustrates the difference between 16-bit signed and unsigned integers. As you’ve seen, with signed integers, the leftmost bit indicates the sign of the number. It will be 0 for a positive value and 1 for a negative value. For unsigned integers, all the bits can be treated as data bits. Because an unsigned number is always regarded as positive, there is no sign bit—the leftmost bit is just part of the number.
If you think that the binary value for –32768 looks strange, remember that negative values are normally represented in 2’s complement form. As you'll see if you look in Appendix E, to convert a positive binary value to a negative binary value (or vice versa) in 2’s complement form, you just flip all the bits and then add 1. Of course, you can’t represent +32768 as a 16-bit signed integer, as the available range only runs from –32768 to +32767.
Signed and Unsigned Char Types
Values stored as type char may actually be signed or unsigned, depending on how your compiler chooses to implement the type, so it may vary between different computers or even between different compilers on the same computer. If you want a single byte to store integer values rather than character codes, you should explicitly declare the variable as either type signed char or type unsigned char.
Note that although type char will be equivalent to either signed char or unsigned char in any given compiler context, all three are considered to be different types. Of course, the words char, short, int, long, signed, and unsigned are all keywords.
Integer Ranges
The basic unit of memory in C++ is a byte. As far as C++ is concerned, a byte has sufficient bits to contain any character in the basic character set used by your C++ compiler, but it is otherwise undefined. As long as a byte can accommodate at least 96 characters, then it’s fine according to the C++ standard. This implies that a byte in C++ is at least 7 bits, but it could be more, and 8-bit bytes seem to be popular at the moment at least. The intention here is to remove hardware architecture dependencies from the standard. If at some future date there’s a reason to produce machines with 16-bit bytes, for instance, then the C++ standard will accommodate that and will still apply. For the time being, though, you should be safe in assuming that a byte is 8 bits.
As I said earlier, the memory allocated for each type of integer variable isn’t stipulated exactly within the ANSI C++ standard. What is said on the topic is the following:
- A variable of type char occupies sufficient memory to allow any character from the basic character set to be stored, which is 1 byte.
- A value of type int will occupy the number of bytes that’s natural for the hardware environment in which the program is being compiled.
- The signed and unsigned versions of a type will occupy the same amount of memory.
- A value of type short int will occupy at least as many bytes as type char; a value of type int will occupy at least as many bytes as type short int; and a value of type long int will occupy at least as many bytes as type int.
In a sentence, type char is the smallest with 1 byte, type long is the largest, and type int is somewhere between the two but occupies the number of bytes best suited to your computer’s integer arithmetic capability. The reason for this vagueness is that the number of bytes used for type int on any given computer should correspond to that which results in the most efficient integer arithmetic. This will depend on the architecture of the machine. In most machines, it’s 4 bytes, but as the performance and architecture of computer hardware advances, there’s increasing potential for it to be 8 bytes.
The actual number of bytes allocated to each integer type by your compiler will determine the range of values that can be stored. Table 2-4 shows the ranges for some typical sizes of integer variables.
Table 2-4. Ranges of Values for Integer Variables
Type            Size (Bytes)  Range of Values
char            1             –128 to 127
unsigned char   1             0U to 255U
short           2             –32768 to 32767
unsigned short  2             0U to 65535U
int             4             –2147483648 to 2147483647
unsigned int    4             0U to 4294967295U
long            8             –9223372036854775808L to 9223372036854775807L
unsigned long   8             0UL to 18446744073709551615UL
The Type of an Integer Literal
I’ve introduced the idea of prefixes being applied to an integer value to affect the number base for the value. I’ve also informally introduced the notion of the suffixes U and L being used to identify integers as being of an unsigned type or of type long. Let’s now pin these options down more precisely and understand how the compiler will determine the type of a given integer literal.
First, Table 2-5 presents a summary of the options you have for the prefix and suffix to an integer value.
Table 2-5. Suffixes and Prefixes for Integer Values
Suffix/Prefix                    Description
No prefix                        The value is a decimal number.
Prefix of 0x or 0X               The value is a hexadecimal number.
Prefix of 0 (a zero)             The value is an octal number.
Suffix of u or U                 The value is of an unsigned type.
Suffix of L or l (lowercase L)   The value is of type long.
The last two items in the table can be combined in any sequence or combination of upper- and lowercase U and L, so UL, LU, uL, Lu, and so on are all acceptable. Although you can use the suffix l, which is a lowercase L, you should avoid doing so because of the obvious potential for confusion with the digit 1.
Now let’s look at how the various combinations of prefixes and suffixes that you can use with integer literals will be interpreted by the compiler:
- A decimal integer literal with no suffix will be interpreted as being of type int if it can be accommodated within the range of values provided by that type. Otherwise, it will be interpreted as being of type long.
- An octal or hexadecimal literal with no suffix will be interpreted as the first of the types int, unsigned int, long, and unsigned long in which the value can be accommodated.
- A literal with a suffix of u or U will be interpreted as being of type unsigned int if the value can be accommodated within that type. Otherwise, it will be interpreted as type unsigned long.
- A literal with a suffix of l or L will be interpreted as being of type long if the value can be accommodated within that type. Otherwise, it will be interpreted as type unsigned long.
- A literal with a suffix combining both U and L in upper- or lowercase will be interpreted as being of type unsigned long.
If the value for a literal is outside the range of the possible types, then the behavior is undefined but will usually result in an error message from the compiler.
You’ll undoubtedly have noticed that you have no way of specifying an integer literal to be of type short int or unsigned short int. When you supply an initial value in a declaration for a variable of either of these types, the compiler will automatically convert the value of the literal to the required type, for example:
unsigned short n = 1000;
Here, according to the preceding rules, the literal will be interpreted as being of type int. The compiler will convert the value to type unsigned short and use that as the initial value for the variable. If you used -1000 as the initial value, the value couldn’t be represented in type unsigned short because negative numbers are by definition outside the range of this type. The conversion is still defined, though: the result wraps around modulo 2^n, where n is the number of bits in the type, so with a 16-bit unsigned short the variable would end up holding 64536. Many compilers can warn you about this, but the statement is not an error.
Remember that the range of values that can be stored for each integer type is dependent on your compiler. Table 2-4 shows “typical” values, but your compiler may well allocate different amounts of memory for particular types, thus providing for different ranges of values. You also need to be conscious of the possible variations in types when porting an application from one system to another.
So far, I’ve largely ignored character literals and variables of type char. Because these have some unique characteristics, you’ll deal with character literals and variables that store character codes later in this chapter and press on with integer calculations first. In particular, you need to know how to store a result.