Fundamentals (of Linux Networking)

This article discusses the various types of networks, the methods for connecting networks, how network data is moved from network to network, and the protocols used on today’s popular networks. It is excerpted from chapter one of the book The Definitive Guide to Linux Networking Programming, written by Keir Davis et al. (Apress, 2004; ISBN: 1590593227).

Networks and Protocols

Networks came into existence as soon as there were two of something: two cells, two animals, and, obviously, two computers. While the overwhelming popularity of the Internet leads people to think of networks only in a computer context, a network exists anytime there is communication between two or more parties. The differences between various networks are matters of implementation, as the intent is the same: communication. Whether it is two people talking or two computers sharing information, a network exists. The implementations are defined by such aspects as medium and protocol. The network medium is the substance used to transmit the information; the protocol is the common system that defines how the information is transmitted.

In this chapter, we discuss the types of networks, the methods for connecting networks, how network data is moved from network to network, and the protocols used on today’s popular networks. Network design, network administration, and routing algorithms are topics suitable for an entire book of their own, so out of necessity we’ll present only overviews here. With that in mind, let’s begin.

Circuits vs. Packets

In general, there are two basic types of network communications: circuit-switched and packet-switched. Circuit-switched networks are networks that use a dedicated link between two nodes, or points. Probably the most familiar example of a circuit-switched network is the legacy telephone system. If you wished to make a call from New York to Los Angeles, a circuit would be created between point A (New York) and point B (Los Angeles). This circuit would be dedicated; that is, there would be no other devices or nodes transmitting information on that network, and the resources needed to make the call possible, such as copper wiring, modulators, and more, would be used for your call and your call only. The only nodes transmitting would be the two parties on each end.

One advantage of a circuit-switched network is the guaranteed capacity. Because the connection is dedicated, the two parties are guaranteed a certain amount of transmission capacity, even though that amount has an upper limit. A big disadvantage of circuit-switched networks, however, is cost. Dedicating resources to facilitate a single call across thousands of miles is a costly proposition, especially since the cost is incurred whether or not anything is transmitted. For example, consider making the same call to Los Angeles and getting an answering machine instead of the person you were trying to reach. On a circuit-switched network, the resources are committed to the network connection and the costs are incurred even though the only thing transmitted is a message of unavailability.

A packet-switched network uses a different approach from a circuit-switched network. Commonly used to connect computers, a packet-switched network takes the information communicated on the network and breaks it into a series of packets, or pieces. These packets are then transmitted on a common network. Each packet consists of identification information as well as its share of the larger piece of information. The identification information on each packet allows a node on the network to determine whether the information is destined for it or the packet should be passed along to the next node in the chain. Once the packet arrives at its destination, the receiver uses the identification portion of the packet to reassemble the pieces and create the complete version of the original information. For example, consider copying a file from one computer in your office to another. On a packet-switched network, the file would be split into a number of packets. Each packet would have specific identification information as well as a portion of the file. The packets would be sent out onto the network, and once they arrived at their destination, they would be reassembled into the original file.
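The file-copy example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Packet fields, the packetize and reassemble helpers, and the four-byte packet size are hypothetical names and values chosen for this sketch, not part of any real protocol.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dest: str       # identification: which node the packet is meant for
    seq: int        # identification: position of this piece in the original data
    total: int      # identification: how many pieces make up the whole
    payload: bytes  # this packet's share of the larger piece of information


def packetize(data: bytes, dest: str, size: int = 4) -> list:
    """Split data into pieces of at most `size` bytes, each tagged with identification info."""
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    return [Packet(dest, seq, len(chunks), chunk) for seq, chunk in enumerate(chunks)]


def reassemble(packets: list, me: str) -> bytes:
    """Keep only the packets addressed to `me` and rebuild the original data."""
    mine = sorted((p for p in packets if p.dest == me), key=lambda p: p.seq)
    if not mine or len(mine) != mine[0].total:
        raise ValueError("some packets are missing")
    return b"".join(p.payload for p in mine)
```

Because every packet carries its own sequence number, the receiver can rebuild the data even if the packets arrive out of order, and a node that is not the destination simply ignores them.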

The big advantage of packet-switched networks over circuit-switched networks is the ability to share resources. On a packet-switched network, many nodes can exist on the network, and all nodes can use the same network resources as all of the others, sharing in the cost. The disadvantage of packet-switched networks, however, is the inability to guarantee capacity. As more and more nodes sharing the resources try to communicate, the portion of the resources available to each node decreases.

Despite their disadvantages, packet-switched networks have become the de facto standard whenever the term “network” is used. Recent developments in networking technologies have decreased the price point for capacity significantly, making a network where many nodes or machines can share the same resources cost-effective. For the purposes of discussion in this book, the word “network” will mean a packet-switched network.

Internetworking

A number of different technologies exist for creating networks between computers. The terms can be confusing and in many cases can mean different things depending on the context in which they’re used. The most common network technology is the concept of a local area network, or LAN. A LAN consists of a number of computers connected together on a network such that each can communicate with any of the others. A LAN typically takes the form of two or more computers joined together via a hub or switch, though in its simplest form two computers connected directly to each other can be called a LAN as well. When using a hub or switch, the ability to add computers to the network becomes trivial, requiring only the addition of another cable connecting the new node to the hub or switch. That’s the beauty of a packet-switched network, for if the network were circuit-switched, we would have to connect every node on the network to every other node, and then figure out a way for each node to determine which connection to use at any given time.

LANs are great, and in many cases they can be all that’s needed to solve a particular problem. However, the advantages of a network really become apparent when you start to connect one network to another. This is called internetworking, and it forms the basis for one of the largest known networks: the Internet. Consider the following diagrams. Figure 1-1 shows a typical LAN.


Figure 1-1.  A single network

You can see there are a number of computers, or nodes, connected to a common point. In networking parlance, this is known as a star configuration. This type of LAN can be found just about anywhere, from your home to your office, and it’s responsible for a significant portion of communication activity every day. But what happens if you want to connect one LAN to another?

As shown in Figure 1-2, connecting two LANs together forms yet another network, this one consisting of two smaller networks connected together so that information can be shared not only between nodes on a particular LAN, but also between nodes on separate LANs via the larger network.

Figure 1-2.  Two connected networks

Because the network is packet-switched, you can keep connecting networks together until the total number of nodes creates enough traffic to clog the network. Past that point, more involved network technologies beyond the scope of this book are used to limit the traffic problems on interconnected networks and improve network efficiency. Routers, network addressing schemes, long-haul transmission technologies such as dense wavelength division multiplexing (DWDM), and long-haul network protocols such as asynchronous transfer mode (ATM) make it feasible to connect a practically unlimited number of LANs to each other. Nodes on these LANs can then communicate with nodes on remote networks as if they were on the same local network, which limits packet traffic problems and makes network interconnection independent of the supporting long-distance systems and hardware. The key concept in linking networks together is that each local network takes advantage of its packet-switched nature to allow communication with any number of other networks without requiring a dedicated connection to each of those other networks.

Ethernets

Regardless of whether we’re talking about one network or hundreds of networks connected together, the most popular type of packet-switched network is the Ethernet. Developed in the 1970s at Xerox PARC and later standardized by Xerox, Intel, and Digital Equipment Corporation, Ethernets originally consisted of a single cable connecting the nodes on a network. As the Internet grew explosively, as client-server computing became the norm, and as more and more computers were linked together, a simpler, cheaper cabling technology known as twisted pair gained acceptance. Using copper conductors much like traditional phone system wiring, twisted pair cabling made it even cheaper and easier to connect computers together in a LAN. A big advantage to twisted pair cabling is that, unlike early Ethernet cabling, a node can be added or removed from the network without causing transmission problems for the other nodes on the network.

A more recent innovation is the concept of broadband. Typically used in connection with Internet access via cable TV systems, broadband works by multiplexing multiple network signals on one cable by assigning each network signal a unique frequency. The receivers at each node of the network are tuned to the correct frequency and receive communications on that frequency while ignoring communications on all the others.

A number of alternatives to Ethernet for local area networking exist. Some of these include IBM’s Token Ring, ARCNet, and DECNet. You might encounter one of these technologies, as Linux supports all of them, but in general the most common is Ethernet.

Ethernet Frames

On your packet-switched Ethernet, each packet of data can be considered a frame. An Ethernet frame has a specific structure, though its length is variable: the minimum length is 64 bytes and the maximum length is 1518 bytes, although proprietary implementations can extend the upper limit to 4096 bytes or higher. A more recent Ethernet extension called jumbo frames even allows frame sizes as high as 9000 bytes, and newer technologies such as version 6 of the Internet Protocol (discussed later) allow packets as large as 4GB. In practice, though, Ethernet frames use the traditional size in order to maintain compatibility between different architectures.

Because the network is packet-based, each frame must contain a source address and destination address. In addition to the addresses, a typical frame contains a preamble, a type indicator, the data payload, and a cyclic redundancy checksum (CRC). The preamble is 64 bits long and typically consists of alternating 0s and 1s to help network nodes synchronize transmissions. The type indicator is 16 bits long, and the CRC is 32 bits. The remaining bits in the packet consist of the actual packet data being sent (see Figure 1-3).

Figure 1-3.  An Ethernet frame

The type field is used to identify the type of data being carried by the packet. Because Ethernet frames have this type indicator, they are known as self-identifying. The receiving node can use the type field to determine the data contained in the packet and take appropriate action. This allows the use of multiple protocols on the same node and the same network segment. If you wanted to create your own protocol, you could use a frame type that did not conflict with any others being used, and your network nodes could communicate freely without interrupting any of the existing communications.

The CRC field is used by the receiving node to verify that the packet of data has been received intact. The sender computes the CRC value and adds it to the packet before sending the packet. On the receiving end, the receiver recalculates the CRC value for the packet and compares it to the value sent by the sender to confirm the packet was received intact.
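To make the frame layout and the CRC check concrete, here is a minimal Python sketch. It is a simplification under stated assumptions: the preamble and minimum-length padding are left out, build_frame and parse_frame are hypothetical helper names, and zlib.crc32 stands in for the hardware's checksum (it uses the same CRC-32 polynomial, but the exact bit ordering a network card puts on the wire is not reproduced here).

```python
import struct
import zlib

# Destination address (6 bytes), source address (6 bytes), type (16 bits).
ETH_HEADER = struct.Struct("!6s6sH")


def build_frame(dst: bytes, src: bytes, eth_type: int, payload: bytes) -> bytes:
    """Return header + payload + a 32-bit CRC computed over header and payload."""
    body = ETH_HEADER.pack(dst, src, eth_type) + payload
    return body + struct.pack("!I", zlib.crc32(body))


def parse_frame(frame: bytes):
    """Recompute the CRC, compare it to the sender's, and split the frame into fields."""
    body = frame[:-4]
    (sent_crc,) = struct.unpack("!I", frame[-4:])
    if zlib.crc32(body) != sent_crc:
        raise ValueError("CRC mismatch: frame was damaged in transit")
    dst, src, eth_type = ETH_HEADER.unpack(body[:ETH_HEADER.size])
    return dst, src, eth_type, body[ETH_HEADER.size:]
```

A receiver would look at the returned type value to decide which protocol handler gets the payload; for instance, 0x0800 is the type value used for IPv4.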

Addressing

We’ve discussed the concept of two or more computers communicating over a network, and we’ve discussed the concept of abstracting the low-level concerns of internetworking so that as far as one computer is concerned, the other computer could be located nearby or on the other side of the world. Because every packet contains the address of the source and the destination, the actual physical distance between two network nodes really doesn’t matter, as long as a transmission path can be found between them. Sounds good, but how does one computer find the other? How does one node on the network “call” another node?

For communication to occur, each node on the network must have its own address. This address must be unique, just as someone’s phone number is unique. For example, while two or more people might have 555-9999 as their phone number, only one person will have that phone number within a certain area code, and that area code will exist only once within a certain country code. This accomplishes two things: it ensures that within a certain scope each number is unique, and it allows each person with a phone to have a unique number.

Ethernet Addresses

Ethernets are no different. On an Ethernet, each node has its own address. This address must be unique to avoid conflicts between nodes. Because Ethernet resources are shared, every node on the network receives all of the communications on the network. It is up to each node to determine whether the communication it receives should be ignored or answered based on the destination address. It is important not to confuse an Ethernet address with a TCP/IP or Internet address, as they are not the same. Ethernet addresses are physical addresses tied directly to the hardware interfaces connected via the Ethernet cable running to each node.

An Ethernet address is an integer with a size of 48 bits. Ethernet hardware manufacturers are assigned blocks of Ethernet addresses and assign a unique address to each hardware interface in sequence as they are manufactured. The Ethernet address space is managed by the Institute of Electrical and Electronics Engineers (IEEE). Assuming the hardware manufacturers don’t make a mistake, this addressing scheme ensures that every hardware device with an Ethernet interface can be addressed uniquely. Moving an Ethernet interface from one node to another or changing the Ethernet hardware interface on a node changes the Ethernet address for that node. Thus, Ethernet addresses are tied to the Ethernet device itself, not the node hosting the interface. If you purchase a network card at your local computer store, that network card has a unique Ethernet address on it that will remain the same no matter which computer has the card installed.

Let’s look at an example using a computer running Linux.

[user@host user]$ /sbin/ifconfig eth0
eth0     Link encap:Ethernet HWaddr 00:E0:29:5E:FC:BE
         inet addr:192.168.2.1 Bcast:192.168.2.255 Mask:255.255.255.0
         UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
         RX packets:35772 errors:0 dropped:0 overruns:0 frame:0
         TX packets:24414 errors:0 dropped:0 overruns:0 carrier:0
         collisions:0 txqueuelen:100
         RX bytes:36335701 (34.6 Mb) TX bytes:3089090 (2.9 Mb)
         Interrupt:5 Base address:0x6000

Using the /sbin/ifconfig command, we can get a listing of the configuration of our eth0 interface on our Linux machine. Your network interface might have a different name than eth0, which is fine. Just use the appropriate value, or use the -a option to ifconfig to get a listing of all of the configured interfaces if you don’t know the name of yours. The key part of the output, though, is the first line. Notice the parameter labeled HWaddr. In our example, it has a value of 00:E0:29:5E:FC:BE, which is the physical Ethernet address of this node. Remember that we said an Ethernet address is 48 bits. Our example address has six two-digit hex values. Each one represents 8 bits, with a value range from 00 to FF, for a total of 48 bits.

But what does this tell us? As mentioned previously, each hardware manufacturer is assigned a 24-bit value by the IEEE. This 24-bit value (3 octets) must remain consistent across all hardware devices produced by this manufacturer. The manufacturer uses the remaining 3 octets as a sequential number to create a unique, 48-bit Ethernet address. Let’s see what we can find out about our address.

Open a web browser and go to this address: http://standards.ieee.org/regauth/oui/index.shtml . In the field provided, enter the first 3 octets of our example address, in this case 00-e0-29 (substitute a hyphen [-] for the colon [:]). Click Search, and you’ll see a reply that looks like this:

00-E0-29  (hex)       STANDARD MICROSYSTEMS CORP.
00E029    (base 16)   STANDARD MICROSYSTEMS CORP.
                      6 HUGHES
                      IRVINE CA 92718
                      UNITED STATES

That’s pretty descriptive. It tells us that the hardware manufacturer of our network interface is Standard Microsystems, also known as SMC. Using the same form, you can also search by company name. To illustrate how important it is that these numbers be managed, try searching with a value similar to 00-e0-29, such as 00-e0-27. Using 27, you’ll find that the manufacturer is Dux, Inc. Thus, as each manufacturer is creating their products, they’ll increase the second half of the Ethernet address sequentially to ensure that each device has a unique value. In our case, the second half of our address is 5E-FC-BE, which is our hardware interface’s unique identifier. If the results of your search don’t match the vendor of your network card, keep in mind that many companies resell products produced by another or subcontract their manufacturing to someone else.

The Ethernet address can also take on two other special values. In addition to being the unique address of a single physical interface, it can be a broadcast address for the network itself as well as a multicast address. The broadcast address is reserved for sending to all nodes on a network simultaneously. Multicast addresses allow a limited form of broadcasting, where a subset of network nodes agrees to respond to the multicast address.
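The structure of an Ethernet address can be sketched in Python as follows. The describe_mac helper is a hypothetical name used only for this illustration. It relies on two well-known conventions: the broadcast address has all 48 bits set (FF:FF:FF:FF:FF:FF), and a multicast address has the lowest bit of its first octet set.

```python
def describe_mac(mac: str) -> dict:
    """Split a 48-bit Ethernet address into its two halves and classify it."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("an Ethernet address is exactly 6 octets (48 bits)")
    if octets == b"\xff" * 6:
        kind = "broadcast"   # all 48 bits set: every node on the network
    elif octets[0] & 0x01:
        kind = "multicast"   # lowest bit of the first octet set: a group of nodes
    else:
        kind = "unicast"     # a single physical interface
    return {
        "oui": octets[:3].hex("-").upper(),     # manufacturer's 24-bit value
        "device": octets[3:].hex("-").upper(),  # manufacturer-assigned 24 bits
        "kind": kind,
    }
```

For our example address 00:E0:29:5E:FC:BE, this yields the manufacturer prefix 00-E0-29 and the device portion 5E-FC-BE. The manufacturer prefix is only meaningful for unicast addresses.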

The Ethernet address is also known as the MAC address. MAC stands for Media Access Control. Because our Ethernet is a shared network, only one node can “talk” at any one time using the network. Before a node transmits information, it first “listens” to the network to see if any other node is using the network. If so, it waits a randomly chosen amount of time and then tries to communicate again. If no other node is using the network, our node sends its message and awaits a reply. If two nodes “talk” at the same time, a collision occurs. Collisions on shared networks are normal and are handled by the network itself so as not to cause problems, provided the ratio of collisions to communications does not get too high. In the case of Ethernets, a collision rate higher than 60 percent is typically cause for concern. Each MAC address must be unique, so a node about to transmit can compare addresses to check whether another node is already transmitting. Thus, the MAC address (Ethernet address) helps control the collision rate and allows nodes to determine if the network is free to use.

Gateways

We’ve discussed that the Internet is a network built by physically connecting other networks. To connect our networks together, we use a special device called a gateway. Any Ethernet node can conceivably act as a gateway, though many do not. Gateways have two or more physical network interfaces and do a particular job, just as their name implies: they relay packets destined for other networks, and they receive packets destined for nodes on one of their own networks. Building on our earlier diagram, here’s how it looks when you connect two networks together with a gateway (see Figure 1-4).

Figure 1-4.  Two connected networks with a gateway

Gateways can also be called routers, since they route packets from one network to another. If you consider that all networks are equal, then the notion of transmitting packets from one to the other becomes a little easier. No longer is it necessary for our network nodes to understand how to find every other node on remote networks. Maintaining that amount of ever-changing information on every computer connected to every network would be impossible. Instead, nodes on our local network only need to know the address of the gateway. That is, local nodes only need to know which node is the “exit” or “gate” to all other networks. The gateway takes on the task of correctly routing packets with foreign destinations to either the remote network itself or another gateway. For example, consider Figure 1-5, which shows three interconnected networks.

Figure 1-5.  Multiple networks, multiple gateways

In this diagram, we have three networks: Red, Green, and Blue. There are two gateways, Foo and Bar. If a node on the Red network wants to send a packet to a node on the Green or Blue network, it does not need to keep track of the addresses on either network. It only needs to know that its gateway to any other network besides its own is Foo. The packets destined for the remote network are sent to Foo, which then determines whether the packet is destined for the Green network or the Blue network. If Green, the packet is sent to the appropriate node on the Green network. If Blue, however, the packet is sent to Bar, because Foo only knows about the Red and Green networks. Any packet for any other network needs to be sent to the next gateway, in this case Bar. This scenario is multiplied over and over and over in today’s network environment, and it significantly decreases the amount of information that each network node and gateway has to manage.

Likewise, the reverse is true. When the receiver accepts the packet and replies, the same decision process occurs. The sender determines if the packet is destined for its own network or a remote network. If remote, then the packet is sent to the network’s gateway, and from there to either the receiver or yet another gateway. Thus, a gateway is a device that transmits packets from one network to another.

Gateways seem simple, but as we’ve mentioned, asking one device to keep track of the information for every network that’s connected to every other network is impossible. So how do our gateways do their job without becoming hopelessly buried by information? The gateway rule is critical: network gateways route packets based on destination networks, not destination nodes.

Thus, our gateways aren’t required to know how to reach every node on all the networks that might be connected. A particular set of networks might have thousands of nodes on it, but the gateway doesn’t need to keep track of all of them. Gateways only need to know which node on their own network will move packets from their network to some other network. Eventually, the packets reach their destination network. Since all devices on a particular network check all packets to see if the packets are meant for them, the packets sent by the gateway will automatically get picked up by the destination host without the sender needing to know any specifics except the address of its own gateway. In short, a node sending data needs to decide one thing: whether the data is destined for a local network node or remote network node. If local, the data is sent between the two nodes directly. If remote, the data is sent to the gateway, which in turn makes the same decision until the data eventually gets picked up by the recipient.
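The gateway rule can be sketched with the Red, Green, and Blue networks from Figure 1-5. Treat this as a toy model under stated assumptions: the text only says that Foo knows about Red and Green, so Bar's attached networks, each network's default gateway, and the path helper itself are assumptions made for this sketch.

```python
# Each gateway knows only the networks it is attached to, plus the next
# gateway to hand a packet to when the destination is somewhere else.
GATEWAYS = {
    "Foo": {"attached": {"Red", "Green"}, "next": "Bar"},
    "Bar": {"attached": {"Green", "Blue"}, "next": "Foo"},  # assumed layout
}

# Each network's nodes know only their own gateway (assumed assignments).
DEFAULT_GATEWAY = {"Red": "Foo", "Green": "Foo", "Blue": "Bar"}


def path(src_network: str, dst_network: str) -> list:
    """Return the gateways a packet passes through, in order."""
    if src_network == dst_network:
        return []  # local destination: the two nodes talk directly
    hops = []
    gateway = DEFAULT_GATEWAY[src_network]
    while gateway not in hops:
        hops.append(gateway)
        if dst_network in GATEWAYS[gateway]["attached"]:
            return hops  # this gateway sits on the destination network
        gateway = GATEWAYS[gateway]["next"]
    raise ValueError(f"no route to network {dst_network!r}")
```

Notice that nothing in the tables mentions an individual node. The decision at every step is made on the destination network alone, which is what keeps the amount of information each gateway has to manage small.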

Internet Addresses

So far, you’ve seen that on a particular network, every device must have a unique address, and you can connect many networks together to form a larger network using gateways. Before a node on your network “talks,” it checks to see if anyone else is “talking,” and if not, it goes ahead with its communication. Your networks are interconnected, though! What happens if the nodes on your office LAN have to wait for some node on a LAN in Antarctica to finish talking before they can talk? Nothing would ever be sent—the result would be gridlock! How do you handle the need to identify a node with a unique address on interconnected networks while at the same time isolating your own network from every other network? Unless one of your nodes has a communication for another node on another network, there should be no communication between networks and no need for one to know of the existence of the other until the need to communicate exists. You handle the need for a unique address by assigning protocol addresses to physical addresses in conjunction with your gateways. In our scenario, these protocol addresses are known as Internet Protocol (IP) addresses.

IP addresses are virtual. That is, there is no required correlation between a particular IP address and its physical interface. An IP address can be moved from one node to another at will, without requiring anything but a software configuration change, whereas changing a node’s physical address requires changing the network hardware. Thus, any node on an internet has both a physical Ethernet address (MAC address) and an IP address.

Unlike an Ethernet address, an IP address is 32 bits long and consists of both a network identifier and a host identifier. The network identifier bits of the IP addresses for all nodes on a given network are the same. The common format for listing IP addresses is known as dotted quad notation because it divides the IP address into four parts (see Table 1-1). The network bits of an IP address are the leading octets, and the address space is divided into three classes: Class A, Class B, and Class C. Class A addresses use just 8 bits for the network portion, while Class B addresses use 16 bits and Class C addresses use 24 bits.

Table 1-1. Internet Protocol Address Classes

CLASS   DESCRIPTION                                NETWORK BITS   HOST BITS
A       Networks 1.0.0.0 through 127.0.0.0         8              24
B       Networks 128.0.0.0 through 191.255.0.0     16             16
C       Networks 192.0.0.0 through 223.255.255.0   24             8
D       Multicast (reserved)
E       Reserved for future use

Let’s look at an example. Consider the IP address 192.168.2.1. From Table 1-1, you can tell that this is a Class C address. Since it is a Class C address, you know that the network identifier is 24 bits long and the host identifier is 8 bits long. This translates to “the node with address 1 on the network with address 192.168.2.0.” Adding a host to the same Class C network would require a second address with the same network identifier, but a different host identifier, such as 192.168.2.2, since every host on a given network must have a unique address.
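Reading the class of an address off its first octet, as Table 1-1 does, can be sketched like this. The address_class helper is a hypothetical name for this illustration. The boundaries for Classes A through C come from the table; the Class D and Class E boundaries (first octets 224 through 239 and 240 through 255) are not listed in the table and are filled in here from the standard classful layout.

```python
def address_class(ip: str):
    """Return (class letter, number of network bits) for a dotted quad address."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or not all(0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted quad address: {ip!r}")
    first = octets[0]
    if first < 128:
        return ("A", 8)
    if first < 192:
        return ("B", 16)
    if first < 224:
        return ("C", 24)
    if first < 240:
        return ("D", None)  # multicast: no network/host split
    return ("E", None)      # reserved for future use
```

For the address 192.168.2.1 shown in the ifconfig listing earlier, the function reports Class C with 24 network bits, leaving 8 bits for the host.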

You may have noticed that the table doesn’t include every possible value. This is because the octets 0 and 255 are reserved for special use. The octet 0 (all 0s) is the address of the network itself, while the octet 255 (all 1s) is called the broadcast address because it refers to all hosts on a network simultaneously. Thus, in our Class C example, the network address would be 192.168.2.0, and the broadcast address would be 192.168.2.255. Because every address range needs both a network address and a broadcast address, the number of usable addresses in a given range is always 2 less than the total. For example, you would expect that on a Class C network you could have 256 unique hosts, but you cannot have more than 254, since one address is needed for the network and another for the broadcast.
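The arithmetic can be checked with Python's standard ipaddress module. In this sketch, the describe_network helper is a hypothetical name, and the /24 suffix is simply another way of writing the 24 network bits of a Class C network (the mask 255.255.255.0 in the ifconfig listing).

```python
import ipaddress


def describe_network(cidr: str):
    """Return (network address, broadcast address, usable host count)."""
    net = ipaddress.ip_network(cidr)
    usable = net.num_addresses - 2  # minus the network and broadcast addresses
    return str(net.network_address), str(net.broadcast_address), usable


print(describe_network("192.168.2.0/24"))
```

For the Class C example this prints the network address 192.168.2.0, the broadcast address 192.168.2.255, and 254 usable host addresses.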

In addition to the reserved network and broadcast addresses, a portion of each public address range has been set aside for private use. These address ranges can be used on internal networks without fear of conflicts. This helps alleviate the problem of address conflicts and shortages when public networks are connected together. The address ranges reserved for private use are shown in Table 1-2.

Table 1-2. Internet Address Ranges Reserved for Private Use

CLASS   RANGE
A       10.0.0.0 through 10.255.255.255
B       172.16.0.0 through 172.31.255.255
C       192.168.0.0 through 192.168.255.255

If you know your particular network will not be connected publicly, you are allowed to use any of the addresses in the private, reserved ranges as you wish. If you do this, however, you must use software address translation to connect your private network to a public network. For example, if your office LAN uses one of these private ranges as its network, your company’s web server or mail server cannot use one of those addresses, since they are private. To connect your private network to a public network such as the Internet, you would need a public address for your web server or mail server. The private addresses can be “hidden” behind a single public address using a technique called Network Address Translation (NAT), where an entire range of addresses is translated into a single public address by the private network’s gateway. When packets are received by the gateway on its public interface, the destination address of each packet is converted back to the private address. The public address used in this scenario could be one assigned dynamically by your service provider, or it could be from a range of addresses delegated to your network, also by your service provider. When a network address range is delegated, it means that your gateway takes responsibility for routing that address range and receiving packets addressed to the network.
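A rough Python sketch of both ideas follows: checking whether an address falls in one of the private ranges, and keeping a translation table on the gateway. Several things here are assumptions for illustration only: the is_private helper and the Nat class are hypothetical names, the public address 203.0.113.5 comes from a range set aside for documentation, and the table tells connections apart by port number (ports are covered in the next section), which is one common way to do it rather than the only one.

```python
import ipaddress

# The three private ranges, written with their network bit counts.
PRIVATE_NETWORKS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_private(ip: str) -> bool:
    """True if the address lies inside one of the private ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in PRIVATE_NETWORKS)


class Nat:
    """Toy translation table: many private endpoints share one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}   # (private_ip, private_port) -> public_port
        self.back = {}  # public_port -> (private_ip, private_port)

    def outbound(self, private_ip: str, private_port: int):
        """Rewrite a private source endpoint to the shared public address."""
        key = (private_ip, private_port)
        if key not in self.out:
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def inbound(self, public_port: int):
        """Convert the destination of a reply back to the private endpoint."""
        return self.back[public_port]
```

A node at 192.168.2.10 sending from port 5000 would appear to the outside world as 203.0.113.5 on port 40000, and a reply arriving at port 40000 would be handed back to 192.168.2.10.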

Another IP address is considered special. This IP address is known as the loopback address, and it’s typically denoted as 127.0.0.1. The loopback address is used to specify the local machine, also known as localhost. For example, if you were to open a connection to the address 127.0.0.1, you would be opening a network connection to yourself. Thus, when using the loopback address, the sender is the receiver and vice versa. In fact, the entire 127.0.0.0 network is considered a reserved network for loopback use, though anything other than 127.0.0.1 is rarely used.
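Opening a connection to yourself can be tried with a short Python sketch. The loopback_round_trip helper is a hypothetical name, and the sketch assumes the machine has a working loopback interface. It binds to port 0, which asks the operating system to pick any free port (ports are described in the next section).

```python
import socket


def loopback_round_trip(message: bytes) -> bytes:
    """Send `message` to ourselves over 127.0.0.1 and return what arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS choose a free port
        server.listen(1)
        port = server.getsockname()[1]
        with socket.create_connection(("127.0.0.1", port)) as client:
            conn, _ = server.accept()
            with conn:
                client.sendall(message)
                client.shutdown(socket.SHUT_WR)  # signal that we are done sending
                received = b""
                while chunk := conn.recv(1024):
                    received += chunk
                return received
```

Here the same process plays both roles, so the sender really is the receiver: whatever goes into the client socket comes back out of the accepted connection without ever leaving the machine.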

Ports

The final component of IP addressing is the port. Ports are virtual destination “points” and allow a node to conduct multiple network communications simultaneously. They also provide a standard way to designate the point where a node can send or receive information. Conceptually, think of ports as “doors” where information can come and go from a network node.

Port numbers are 16-bit values, so on Linux systems, as elsewhere, the highest port number is 65,535. Many of the lower port numbers are reserved, such as port 80 for web servers, port 25 for sending mail, and port 23 for telnet servers. Ports are designated with a colon when describing an IP address and port pair. For example, the address 10.0.0.2:80 can be read as “port 80 on the address 10.0.0.2,” which would also mean “the web server on 10.0.0.2” since port 80 is typically used by and reserved for web services. Which port is used is up to the discretion of the developer, provided the ports are not already in use or reserved. A list of reserved ports and the names of the services that use them can be found on your Linux system in the /etc/services file, or at the Internet Assigned Numbers Authority (IANA) site listed here: http://www.iana.org/assignments/port-numbers . Table 1-3 contains a list of commonly used (and reserved) ports.

Table 1-3. Commonly Used Ports

PORT   SERVICE
21     File Transfer Protocol (FTP)
22     Secure Shell (SSH)
23     Telnet
25     Simple Mail Transfer Protocol (SMTP)
53     Domain Name System (DNS)
80     Hypertext Transfer Protocol (HTTP)
110    Post Office Protocol 3 (POP3)
143    Internet Message Access Protocol (IMAP)
443    Hypertext Transfer Protocol Secure (HTTPS)

Without ports, a network host would be allowed to provide only one network service at a time. By allowing the use of ports, a host can conceivably provide more than 65,000 services at any time using a given IP address, assuming each service is offered on a different port. We cover using ports in practice when writing code first in Chapter 2 and then extensively in later chapters.

This version of IP addressing is known as version 4, or IPv4. Because the number of available public addresses has been diminishing with the explosive growth of the Internet, a newer addressing scheme has been developed and is slowly being implemented. The new scheme is known as version 6, or IPv6. IPv6 addresses are 128 bits long instead of the traditional 32 bits, allowing for 2^96 times as many addresses as IPv4. For more on IPv6, consult Appendix A.

Network Byte Order

One final note on IP addressing. Because each hardware manufacturer can develop its own hardware architecture, it becomes necessary to define a standard representation for data. For example, some platforms store integers in what is known as Little Endian format, which means the lowest memory address contains the lowest order byte of the integer (remember that addresses are 32-bit integers). Other platforms store integers in what is known as Big Endian format, where the lowest memory address holds the highest order byte of the integer. Still other platforms can store integers in any number of ways. Without standardization, it becomes impossible to copy bytes from one machine to another directly, since doing so might change the value of the number.

In an internet, packets can carry numbers such as the source address, destination address, and packet length. If those numbers were to be corrupted, network communications would fail. The Internet protocols solve this byte-order problem by defining a standard way of representing integers called network byte order that must be used by all nodes on the network when describing binary fields within packets. Each host platform makes the conversion from its local byte representation to the standard network byte order before sending a packet. On receipt of a packet, the conversion is reversed. Since the data payload within a packet often contains more than just numbers, it is not converted.

The standard network byte order specifies that the most significant byte of an integer is sent first (Big Endian). From a developer’s perspective, each platform defines a set of conversion functions that can be used by an application to handle the conversion transparently, so it is not necessary to understand the intricacies of integer storage on each platform. These conversion functions, as well as many other standard network programming functions, are covered in Chapter 2.
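To make the byte-order discussion concrete, here is a minimal Python sketch rather than the C functions covered in Chapter 2. It uses the standard struct and socket modules; the sample address 10.0.0.2 is borrowed from the port examples earlier, and the variable names are ours.

```python
import socket
import struct

# The address 10.0.0.2 expressed as a 32-bit integer.
address = 0x0A000002

# "!" asks struct for network byte order (Big Endian): most significant byte first.
network_bytes = struct.pack("!I", address)

# "<" asks for Little Endian: least significant byte first.
little_endian_bytes = struct.pack("<I", address)

# htonl/ntohl convert between the host's own representation and network
# byte order; applying one after the other returns the original value
# regardless of the host's byte order.
round_trip = socket.ntohl(socket.htonl(address))

# inet_ntoa expects the four octets in network byte order.
dotted = socket.inet_ntoa(network_bytes)
```

The same four octets appear in reverse order in the two packed forms, which is why copying raw bytes between machines with different conventions would change the value of the number.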

Internet Protocol

So far, we’ve discussed building a network based on Ethernet. We’ve also discussed connecting two or more networks together via a gateway, called an internet, and we’ve covered the basic issues surrounding network and host addressing that allow network nodes to communicate with each other without conflicts. Yet how are all of these dissimilar networks expected to communicate efficiently without problems? What is it that lets one network look the same as any other network?

A protocol exists that enables packet exchanges between networks as if the connected networks were a single, homogeneous network. This protocol is known as the Internet Protocol, or IP, and was defined by RFC 791 in September 1981. These interconnected networks are known as internets, not to be confused with the Internet. The Internet is just one example of a global internet, albeit the most popular and the best known. However, this does not preclude the existence of other internets that use the same technologies, such as IP.

Because IP is hardware independent, it requires a hardware-independent method of addressing nodes on a network, which is the IP addressing system already discussed. In addition to being hardware independent and being a packet-switching technology, IP is also connectionless. IP performs three key functions in an internet:

  • It defines the basic unit of data transfer.
  • It performs the routing function used by gateways and routers to determine which path a packet will take.
  • It uses a set of rules that allow unreliable packet delivery.

These rules determine how hosts and gateways on connected networks should handle packets, whether a packet can be discarded, and what should happen if things go wrong.

Like the physical Ethernet frame that contains data as well as header information, the basic unit of packet transfer used by IP is called the Internet datagram. This datagram also consists of both a header portion and a data portion. Table 1-4 lists the format of the IP datagram header, along with the size of each field within the header.

Table 1-4. IP Datagram Header Format

FIELD                 SIZE       DESCRIPTION
VERS                  4 bits     The version of IP used to create this datagram
HLEN                  4 bits     The length of the datagram header, measured in 32-bit words
SERVICE TYPE          8 bits     Specifies how the datagram should be handled
TOTAL LENGTH          16 bits    The total length of the datagram, measured in octets
IDENTIFICATION        16 bits    A unique integer generated by the sender that allows accurate datagram reassembly
FLAGS                 3 bits     Various control flags, such as whether the datagram may be fragmented
FRAGMENT OFFSET       13 bits    Indicates where in the datagram this fragment belongs
TIME TO LIVE          8 bits     Specifies how long, in seconds, the datagram is allowed to exist
PROTOCOL              8 bits     Specifies which higher-level protocol was used to create the data in the data portion of the datagram
HEADER CHECKSUM       16 bits    Checksum for the header, recomputed and verified at each point where the datagram is processed
SOURCE ADDRESS        32 bits    IP address of the sender
DESTINATION ADDRESS   32 bits    IP address of the recipient
OPTIONS               Variable   Any number of various options, typically used for testing and debugging
PADDING               Variable   Contains the number of zero bits needed to ensure the header size is an exact multiple of 32 bits
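The header layout in Table 1-4 can be sketched with Python's struct module. This is only an illustration under stated assumptions: it handles the fixed 20-octet header with no OPTIONS or PADDING, the function and dictionary key names are ours, and the sample values (addresses, identification, time to live) are made up for the example.

```python
import socket
import struct

# Fixed part of the header in network byte order: VERS/HLEN share one octet,
# FLAGS/FRAGMENT OFFSET share one 16-bit word. Options are not handled here.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")

def parse_ipv4_header(raw: bytes) -> dict:
    """Unpack the first 20 octets of a datagram into the fields of Table 1-4."""
    (vers_hlen, service_type, total_length, identification, flags_offset,
     ttl, protocol, checksum, source, destination) = IPV4_HEADER.unpack(raw[:20])
    return {
        "vers": vers_hlen >> 4,                       # upper 4 bits
        "header_octets": (vers_hlen & 0x0F) * 4,      # HLEN counts 32-bit words
        "service_type": service_type,
        "total_length": total_length,
        "identification": identification,
        "flags": flags_offset >> 13,                  # upper 3 bits
        "fragment_offset": flags_offset & 0x1FFF,     # lower 13 bits
        "time_to_live": ttl,
        "protocol": protocol,
        "header_checksum": checksum,
        "source": socket.inet_ntoa(source),
        "destination": socket.inet_ntoa(destination),
    }

# A hypothetical header: version 4, HLEN 5 words, total length 4000 octets,
# "more fragments" flag set, protocol 17 (UDP), checksum left at zero.
sample_header = IPV4_HEADER.pack(
    0x45, 0, 4000, 1, 0x2000, 64, 17, 0,
    socket.inet_aton("10.0.0.4"), socket.inet_aton("192.168.2.1"),
)
parsed = parse_ipv4_header(sample_header)
```

Note how HLEN is stored in words: a value of 5 corresponds to the 20-octet header used in this sketch.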


Figure 1-6.  IP datagram encapsulation in an Ethernet frame

Most of these fields look pretty similar to the description of an Ethernet frame. What is the relationship between Ethernet frames, or packets, and IP datagrams? Remember that IP is hardware independent and that Ethernet is a hardware technology. Thus, the IP datagram format must be independent of the Ethernet frame specification. In practice, the most efficient design would be to carry one IP datagram in every Ethernet frame. This concept of carrying a datagram inside a lower-level network frame is called encapsulation. When an IP datagram is encapsulated within an Ethernet frame, it means the entire IP datagram, including header, is carried within the data portion of the Ethernet frame, as shown in Figure 1-6.

We’ve said that an Ethernet frame has a maximum size of 1500 octets. Yet the total length field of an IP datagram is 16 bits wide, and a 16-bit number can represent a datagram size of up to 65,535 octets. How do we cram 65,535 octets into a network frame that maxes out at 1500? By using a technique called fragmentation.

Fragmentation is necessary in network protocols because the goal should be to hide the underlying hardware used to create the network. In our case, it’s Ethernet, but in practice it could be any number of different systems, past or future. It wouldn’t make sense to require changes to higher-level protocols every time someone invented a new hardware network technology, so to be universally compatible, the designers of IP incorporated the ability to split IP datagrams into fragments, assigning one fragment per network frame in the most efficient way possible.

The IP protocol does not guarantee that large datagrams will be delivered without fragmentation, nor does it limit datagrams to some smaller size. The sending node determines the appropriate datagram size and performs the fragmentation, while the receiving node performs the reassembly. The reassembly is made possible by the fragment offset field of the datagram header, which tells the receiver where this particular fragment should go. When the datagram is fragmented, each fragment carries a header that is essentially a duplicate of the original datagram header, with some minor changes. The fragment’s header differs because, if there are more fragments, the “more fragments” flag is set, and the fragment offset will change on each fragment to prevent overwriting.

Thus, an IP datagram of 4000 octets might get fragmented into three Ethernet frames, two containing the maximum data size and the third containing what’s left. On the receiving end, these fragments would be reassembled into the original datagram and would be processed. If our physical network had a smaller frame size than Ethernet, we would get more fragments, and fewer fragments (or no fragments at all) with a larger frame size.
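The arithmetic behind the 4000-octet example can be sketched as follows. The sketch rests on two assumptions not spelled out in the text: every fragment carries a 20-octet header with no options, and, per RFC 791, the fragment offset is counted in units of 8 octets, so every fragment except the last carries a multiple of 8 data octets. The function name is ours.

```python
def fragment(total_length: int, mtu: int = 1500, header_length: int = 20):
    """Split a datagram into (fragment_offset, data_octets, more_fragments) tuples.

    fragment_offset is expressed in 8-octet units, as in the header field.
    """
    payload = total_length - header_length
    # Largest data size per fragment that fits the MTU and is a multiple of 8.
    max_data = (mtu - header_length) // 8 * 8
    fragments = []
    sent = 0
    while sent < payload:
        size = min(max_data, payload - sent)
        more_fragments = sent + size < payload
        fragments.append((sent // 8, size, more_fragments))
        sent += size
    return fragments

ethernet_fragments = fragment(4000)           # Ethernet-sized frames
small_mtu_fragments = fragment(4000, mtu=576) # a network with smaller frames
```

With a 1500-octet frame the datagram's 3980 data octets travel as 1480 + 1480 + 1020, the three frames described above; with a 576-octet frame the same datagram needs eight fragments.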


NOTE 
Gateways are responsible for converting packets from one frame size to another.

Every network has a maximum transmission unit, or MTU. The MTU can be any size. If your packets are sent from a network with a large MTU value to a network with a smaller value (or vice versa), the gateway between the two is responsible for reformatting the packets to comply with each network’s specifications. For example, say you had a gateway with an Ethernet interface and a Token Ring interface. The MTU on the Ethernet network is 1500 octets, while the MTU on the Token Ring network might be larger or smaller. It is the gateway’s responsibility to reformat and fragment the packets again when moving from one network to another. The downside to this is that once fragmented to accommodate the smaller MTU, the packets aren’t reassembled until they reach their destination. Thus, the receiving node will receive datagrams that are fragmented according to the network with the smallest MTU used in the transfer. This can be somewhat inefficient, since after traversing a network with a small MTU, the packets might traverse a network with a much larger MTU without being reformatted to take advantage of the larger frame size. This minor inefficiency is a good trade-off, however, because the gateways don’t need to store or rebuild packet fragments, and the packets can be sent using the best path without concern for reassembly problems for the destination node.

Protocol Layering

So far we’ve discussed the underlying physical hardware of a network, and in the case of an internet, the protocol used to ensure compatibility between different networks. More than one protocol exists, however. For example, take the acronym “TCP/IP.” You already know what “IP” stands for: Internet Protocol. But what about “TCP”? What about other protocols we use on our networks, such as HTTP or FTP? How do these protocols relate to each other if they’re not all the same?

It’s practically impossible to create a single protocol that can handle every issue that might be encountered on a network. Consider security, packet loss, hardware failure, network congestion, and data corruption. These issues and more need to be addressed in any networked system, but it can’t be done with just a single “super” protocol. The solution, then, is to develop a system in which complementary protocols, each handling a specific task, work together in a standardized fashion. This solution is known as protocol layering.

Imagine the different protocols involved in network communications stacked on top of each other in layers. This is also known as a protocol stack or stack. Each layer in a stack takes responsibility for a particular aspect of sending and receiving information on a network, with each layer in the stack working in concert with the other layers (see Figure 1-7).


Figure 1-7.  Protocol layers

As shown in Figure 1-7, sending information to another computer means sending the information “down” through a stack, over the network, and then “up” through the stack on the receiving node. When the receiving node wishes to send a response, the roles are reversed: it becomes a sender and the process is repeated. Each layer on each node is the same. That is, the protocol in layer 3 on the sender is the same protocol in layer 3 on the receiver. Thus, a protocol layer is designed so that layer n at the destination receives essentially the same datagram or packet sent by layer n at the source. We say “essentially” because datagrams have components like time to live fields that will be changed by each node involved in the transfer, even though the core data payload should remain identical from sender to receiver.

Protocol Layer Models

The dominant standard for protocol layering comes from the International Organization for Standardization (ISO) and is known as the Open Systems Interconnection reference model, or simply the OSI model. The OSI model describes seven specific layers: Application, Presentation, Session, Transport, Network, Data Link, and Physical Hardware. A description of each layer is shown in Table 1-5.

Table 1-5. OSI Seven-Layer Reference Model

LAYER   NAME                DESCRIPTION
7       Application         Application programs such as web browsers and file transfer programs
6       Presentation        Standard routines such as compression functions used by layer 7
5       Session             Establishes transmission control between two nodes, regulating which node can transmit and how long it can transmit
4       Transport           Provides additional reliability checks to those performed by lower layers
3       Network             Defines the basic unit of communication across a network, including addressing and routing
2       Data Link           Controls how data is sent between nodes, such as defining frames and frame boundaries
1       Physical Hardware   Controls the physical aspects of a network connection, such as voltage

Figure 1-8 shows the result of applying the OSI model to our earlier layer diagram.


Figure 1-8.  The OSI model

In practice, though, the typical protocol stack found in most networked environments today is known as a TCP/IP stack and, while perfectly compatible with the OSI model, it is conceptually different. The “TCP” in TCP/IP means Transmission Control Protocol and will be discussed later in this chapter. Just by the name alone, you can see that today’s networks require multiple protocols working together to function.

In a TCP/IP environment, the network transport is relatively simple, while the nodes on the network are relatively complex. TCP/IP requires all hosts to involve themselves in almost every network function, unlike some other networking protocols. TCP/IP hosts are responsible for end-to-end error checking and recovery, and also make routing decisions since they must choose the appropriate gateway when sending packets.

Using our OSI diagram as a basis, a corresponding diagram describing a TCP/IP stack would look like Figure 1-9.


Figure 1-9.  The TCP/IP layer model

This diagram shows a TCP/IP stack as having four layers versus the seven layers in an OSI model. There are fewer layers because the TCP/IP model doesn’t need to describe requirements that are needed by older networking protocols like X.25, such as the Session layer. Looking at our TCP/IP stack, we can see that Ethernet is our Network Interface, IP is our Internet layer, and TCP is our Transport layer. The Application layer, then, consists of the applications that use the network, such as a web browser, file transfer client, or other network-enabled applications.

There are two boundaries in our TCP/IP stack that describe the division of information among the application, the operating system, and the network. These boundaries correspond directly to the addressing schemes already discussed. In the Application layer, the application needs to know nothing other than the IP address (and port) of the receiver. Specifics such as datagram fragmentation, checksum calculations, and delivery verification are handled in the operating system by the Transport and Internet layers. Once the packets move from the Internet layer to the Network Interface layer, only physical addresses are used.

At first it would seem like a lookup must be performed to get the physical address of the receiver when starting communications, but this would be incorrect. Remembering the gateway rule, the only physical address that needs to be known by the sender is the physical address of the destination if the destination is on the same network, or the physical address of the gateway if the destination is on a remote network.

User Datagram Protocol

At the Internet layer in our TCP/IP protocol stack, the only information available is the address of the remote node. No other information is available to the protocol, and none is needed. However, without additional information like a port number, your receiving node is limited to conducting a single network communication at any one time. Since modern operating systems allow multiple applications to run simultaneously, you must be able to address multiple applications on the receiving node simultaneously, instead of just one. If you consider that each networked application can “listen” on one or more ports, you can see that by using an IP address and a port, you can communicate with multiple applications simultaneously, up to any limits imposed by the operating system and protocol stack.

In the TCP/IP protocol stack, there are two protocols that provide a mechanism that allows applications to communicate with other applications using ports. One is the Transmission Control Protocol (TCP), which we will discuss in the next section, and the other is the User Datagram Protocol (UDP). UDP makes no guarantee of packet delivery. UDP datagrams can be lost, can arrive out of sequence, can be copied many times, and can be sent faster than the receiving node can process them. Thus, an application that uses UDP takes full responsibility for controlling message loss, reliability, sequencing, and loss of connection. This can be both an advantage and a disadvantage to developers, for while UDP is a lightweight protocol that can be used quickly, the additional application overhead needed to thoroughly manage packet transfer is often overlooked or poorly implemented.


TIP 
When using UDP for an application, make sure to thoroughly test your applications in real environments beyond a low-latency LAN. Many developers choose UDP and test in a LAN environment, only to find their applications are unusable when used over a larger TCP/IP network with higher latencies.

UDP datagrams have a simple format. Like other datagrams, UDP datagrams consist of a header and a data payload. The header is divided into four fields, each 16 bits in size. These fields specify the source port, the destination port, the length of the datagram, and a checksum. Following the header is the data area, as shown in Figure 1-10.


Figure 1-10.  The UDP datagram format

The source port is optional. If used, it specifies the port to which replies should be sent. If unused, it should be set to zero. The length field is the total number of octets in the datagram itself, header and data. The minimum value for length is 8, which is the length of the header by itself.
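The four header fields map directly onto four 16-bit integers in network byte order. The following Python sketch builds a datagram this way; the function name and the sample port numbers are illustrative, and the checksum is simply left at zero (unused).

```python
import struct

# Four 16-bit fields: source port, destination port, length, checksum.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_datagram(source_port: int, destination_port: int,
                       payload: bytes, checksum: int = 0) -> bytes:
    """Return header + payload; length counts both, so it is never below 8."""
    length = UDP_HEADER.size + len(payload)
    header = UDP_HEADER.pack(source_port, destination_port, length, checksum)
    return header + payload

# An empty datagram with an unused source port: header only, length 8.
empty = build_udp_datagram(0, 53, b"")

# A datagram carrying five octets of data: length 8 + 5 = 13.
hello = build_udp_datagram(11908, 53, b"hello")
```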

The checksum value is also optional, and if unused should be set to zero. Even though it’s optional, however, it should be used, since IP doesn’t compute a checksum on the data portion of its own datagram. Thus, without a UDP checksum, there’s no other way to check the integrity of the data on the receiving node. To compute the checksum, UDP uses a pseudo-header, which is prepended to the datagram, and, if the datagram contains an odd number of octets, an octet of zeros, which is appended to get an exact multiple of 16 bits. The entire object, pseudo-header and all, is then used to compute the checksum. The pseudo-header format is shown in Figure 1-11.


Figure 1-11.  The UDP pseudo-header

The octet used for padding is not transmitted with the UDP datagram, nor is the pseudo-header, and neither is counted when computing the length of the datagram.
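A rough sketch of the checksum calculation follows. It assumes the pseudo-header layout defined for UDP over IPv4 in RFC 768 (source address, destination address, a zero octet, the protocol number 17, and the UDP length) and the usual 16-bit one's complement sum; the function names and sample addresses are ours.

```python
import socket
import struct

def ones_complement_sum(data: bytes) -> int:
    """Add up 16-bit words, folding any carry back into the low 16 bits."""
    if len(data) % 2:
        data += b"\x00"  # padding octet, used for the calculation only
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total

def pseudo_header(source_ip: str, destination_ip: str, udp_length: int) -> bytes:
    """Assumed layout: addresses, zero octet, protocol 17, UDP length."""
    return (socket.inet_aton(source_ip) + socket.inet_aton(destination_ip)
            + struct.pack("!BBH", 0, 17, udp_length))

def udp_checksum(source_ip: str, destination_ip: str, datagram: bytes) -> int:
    """Checksum for a datagram whose checksum field is currently zero."""
    data = pseudo_header(source_ip, destination_ip, len(datagram)) + datagram
    checksum = ~ones_complement_sum(data) & 0xFFFF
    return checksum or 0xFFFF  # zero means "unused", so send all ones instead

def udp_checksum_ok(source_ip: str, destination_ip: str, datagram: bytes) -> bool:
    """With the checksum filled in, the sum over everything is all ones."""
    data = pseudo_header(source_ip, destination_ip, len(datagram)) + datagram
    return ones_complement_sum(data) == 0xFFFF

# Header with a zero checksum plus five data octets (odd length, so padded).
unchecked = struct.pack("!HHHH", 11908, 53, 13, 0) + b"hello"
value = udp_checksum("10.0.0.4", "192.168.2.1", unchecked)
checked = unchecked[:6] + struct.pack("!H", value) + unchecked[8:]
```

Neither the pseudo-header nor the padding octet is part of `checked`, which matches the point that they are not transmitted and not counted in the length.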

A number of network services use UDP and have reserved ports. A list of some of the more popular services is shown in Table 1-6.

Table 1-6. Popular UDP Services

PORT   NAME
7      echo
13     daytime
37     time
43     whois
53     domain (DNS)
69     tftp
161    snmp

Transmission Control Protocol

The main difference between UDP and TCP is that, like IP, UDP provides no guarantee of delivery and does not use any method to ensure datagrams are received in a certain order or are transmitted at a certain rate. TCP, on the other hand, provides a mechanism known as reliable stream delivery. Reliable stream delivery guarantees delivery of a stream of information from one network node to another without duplication or data loss. TCP has a number of features that describe the interface between it and the application programs that use it:

  1. Virtual circuit: Using TCP is much like making a phone call. The sender requests a connection with the receiver. Both ends negotiate the parameters of the connection and agree on various details defining the connection. Once the connection is finalized, the applications are allowed to proceed. As far as the applications are concerned, a dedicated, reliable connection exists between the sender and the receiver, but this is an illusion. The underlying transfer mechanism is IP, which provides no delivery guarantee, but the applications are removed from dealing with IP by the TCP layer.
  2. Buffered transfer: The TCP layer, independent of the application, determines the optimal way to package the data being sent, using whatever packet sizes are appropriate. To increase efficiency and decrease network traffic, TCP typically waits, if possible, until it has a relatively large amount of data to send before sending the packet, even if the application is generating data 1 byte at a time. The receiving TCP layer delivers data to the receiving application exactly the way it was sent, so a buffer may exist at each end, independent of the application.
  3. Stream orientation: The receiving node delivers data to the receiving application in exactly the same sequence as it was sent.
  4. Full duplex: Connections provided by TCP over IP are full duplex, which means that data can be transmitted in both directions simultaneously via two independent packet streams. The streams can be used to transfer data or to send control information or commands back to the sender, and either stream can be terminated without harming the other.
  5. Unstructured stream: TCP does not guarantee the structure of the data stream, even though delivery is guaranteed. For example, TCP does not honor markers that might exist in a record set sent to or from a database. It is up to the applications to determine stream content and assemble or disassemble the stream accordingly on each end of the connection. Applications do this by buffering incoming packets when necessary and assembling them in an order that the applications recognize.

The method that TCP uses to guarantee reliable delivery can be described as confirmation and retransmission. The sender keeps track of every packet that is sent and waits for confirmation of successful delivery from the receiver before sending the next packet. The sender also sets an internal timer when each packet is sent and automatically resends the packet should the timer expire before getting confirmation from the receiver. TCP uses a sequence number to determine whether every packet has been received. This sequence number is sent on the confirmation message as well as the packet itself, allowing the sender to match confirmations to packets sent, in case network delays cause the transmission of unnecessary duplicates.

Even with full duplex, though, having to wait for a confirmation on every packet before sending the next can be horribly slow. TCP solves this by using a mechanism called a sliding window. The easiest way to imagine a TCP sliding window is to consider a number of packets that needs to be sent. TCP considers a certain number of packets to be a window and transmits all packets in that window without waiting for confirmation on each one. Once the confirmation is received for the first packet in the window, the window “slides” to contain the next packet to be sent, and it is sent. For example, if the window size was 8, then packets 1 through 8 would be sent. When the confirmation for packet 1 was received, the window would “slide” so that it covered packets 2 through 9, and the ninth packet would be sent. A packet that has been transmitted without confirmation is called an unacknowledged packet. Thus, the total number of unacknowledged packets allowed is equal to the window size. The advantage to a sliding window protocol is that it keeps the network saturated with packets as much as possible, minimizing the time spent waiting for confirmation. The window is matched on the receiving end, so that the receiver “slides” its own window according to which confirmation messages have been sent.
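The window example can be traced with a short simulation. This is a deliberately simplified sketch: it assumes confirmations arrive in order and nothing is lost, so no timers or retransmissions are modeled, and the function name is ours.

```python
def sliding_window_trace(total_packets: int, window_size: int):
    """Return (confirmed_packet, newly_sent_packets) steps for an ideal transfer.

    The first step has no confirmation (None) and lists the initial burst.
    """
    initial = list(range(1, min(window_size, total_packets) + 1))
    next_to_send = len(initial) + 1
    base = 1  # oldest packet that has not yet been confirmed
    trace = [(None, initial)]
    while base <= total_packets:
        confirmed = base
        base += 1  # the window slides forward by one packet
        newly_sent = []
        while next_to_send < base + window_size and next_to_send <= total_packets:
            newly_sent.append(next_to_send)
            next_to_send += 1
        trace.append((confirmed, newly_sent))
    return trace

trace = sliding_window_trace(total_packets=10, window_size=8)
```

For ten packets and a window of 8, the first step sends packets 1 through 8; the confirmation for packet 1 releases packet 9, and the confirmation for packet 2 releases packet 10.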

We noted earlier that TCP connections use what is known as a virtual circuit. Taking that abstraction further, TCP defines a connection as a pair of endpoints, each endpoint being an integer pair made up of the 32-bit IP address and the 16-bit port number. This is important, because by identifying a connection by both of its endpoints, a TCP service on a given port number can be used by multiple connections at the same time. For example, even though a mail server has only one port 25 on which to receive mail, each sender making a connection offers a different integer pair because the source IP address, the source port, or both differ, allowing multiple concurrent connections on the same receiving port.
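One way to picture this is a table of connections keyed by both endpoints. The sketch below is purely illustrative: the class names, addresses, and port numbers are invented, and a real TCP implementation keeps far more state per connection.

```python
from typing import NamedTuple

class Endpoint(NamedTuple):
    address: str  # IP address in dotted-decimal form
    port: int     # 16-bit port number

class Connection(NamedTuple):
    local: Endpoint
    remote: Endpoint

# A hypothetical mail server with a single receiving endpoint on port 25.
mail_server = Endpoint("192.168.2.1", 25)

# Three concurrent connections: two from the same host using different
# source ports, and one from a different host.
connections = {
    Connection(mail_server, Endpoint("10.0.0.4", 11908)): "first message from host A",
    Connection(mail_server, Endpoint("10.0.0.4", 11909)): "second message from host A",
    Connection(mail_server, Endpoint("10.0.0.7", 11908)): "message from host B",
}
```

All three entries share the same local endpoint, yet each key is distinct because the remote endpoint differs.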

The TCP datagram is also known as a segment. Segments do all of the work: establishing and closing connections, advertising window sizes, transferring data, and sending acknowledgments. Figure 1-12 shows a diagram of a TCP segment.


Figure 1-12.  The TCP segment format

As with other datagrams, a TCP segment consists of two parts: header and data. A description of each header field is listed in Table 1-7.

Table 1-7. TCP Header Fields

NAME                    SIZE       DESCRIPTION
Source port             16 bits    The port number of the sending connection endpoint
Destination port        16 bits    The port number of the receiving connection endpoint
Sequence number         32 bits    The position of this segment’s payload in the sender’s stream
Acknowledgment number   32 bits    The octet that the source expects to receive next
Header length           4 bits     The length of the header, measured in 32-bit words
Reserved                6 bits     Reserved for future use
Code bits               6 bits     Defines the purpose and content of this segment (see Table 1-8)
Window                  16 bits    Specifies the maximum amount of data that can be accepted
Checksum                16 bits    Verifies the integrity of the header and data
Urgent pointer          16 bits    Points to the end of urgent data that the receiver should handle immediately (valid when URG is set)
Options                 Variable   Various options used to negotiate between endpoints
Padding                 Variable   Needed to accommodate an OPTIONS value of varying length to ensure header and data end on a 32-bit boundary

Much like the UDP header, the TCP header has fields such as checksum, source port, destination port, and a length (although in TCP it is the length of the header only). However, the TCP header goes much further, due primarily to the need to guarantee delivery. The sequence number is used to indicate the position of the data in a particular segment within the overall byte stream being sent. If a byte stream had 100 packets in it, the sequence number field would be where each segment was numbered, so the receiver would know how to reassemble the stream.

The acknowledgment number, on the other hand, is used by the sender of a segment to identify the next octet it expects to receive from the other endpoint. Thus, the sequence number refers to the byte stream flowing in the same direction as the segment, while the acknowledgment number refers to the byte stream flowing in the opposite direction from the segment. This two-way synchronization system helps ensure that both connection endpoints are receiving the bytes they expect to receive and that no data is lost in transit.

The code bits field is a special header field used to define the purpose of this particular segment. These bits instruct the endpoint how to interpret other fields in the header. Each bit can be either 0 or 1, and they’re counted from left to right. A value of 111111, then, means that all options are “on.” Table 1-8 contains a list and description of the possible values for the code bits field.

Table 1-8. Possible Values for Code Bits Header Field

NAME (LEFT TO RIGHT)   MEANING
URG                    Urgent pointer field is valid.
ACK                    Acknowledgment field is valid.
PSH                    Push requested.
RST                    Reset the connection.
SYN                    Synchronize sequence numbers.
FIN                    Sender has completed its byte stream.
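Reading the six code bits from left to right can be sketched in a few lines of Python. The function name is ours; the bit order follows Table 1-8, with URG as the leftmost (most significant) bit and FIN as the rightmost.

```python
# Names in left-to-right order, as listed in Table 1-8.
CODE_BITS = ("URG", "ACK", "PSH", "RST", "SYN", "FIN")

def decode_code_bits(value: int) -> list:
    """Return the names of the bits that are set in a 6-bit code bits value."""
    return [name for position, name in enumerate(CODE_BITS)
            if value & (0b100000 >> position)]

all_on = decode_code_bits(0b111111)
ack_and_syn = decode_code_bits(0b010010)
fin_only = decode_code_bits(0b000001)
```

A value of 111111 yields every name, while 010010 yields only ACK and SYN.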

Even though TCP is a stream-oriented protocol, situations arise when data must be transmitted out of band. Out-of-band data is sent so that the application at the other end of the connection processes it immediately instead of waiting to complete the entire stream. This is where the urgent pointer header field comes into play. Consider a connection a user wishes to abort, such as a file transfer that is proceeding more slowly than expected. For the abort signal to be processed, the signal must be sent in a segment out of band. Otherwise, the abort signal would not be processed until the file transfer was complete. By sending the segment marked urgent, the receiver is instructed to handle the segment immediately.

The Client-Server Model

Let’s recap. We’ve discussed circuits versus packets; the concept of connecting one or many networks together via gateways; physical addresses; virtual addresses; and the IP, UDP, and TCP protocols. We’ve also used the terms “sender,” “source,” “receiver,” and “destination.” These terms can be confusing, because as you’ve seen already, a TCP connection is a connection of equals, and in a virtual circuit, the roles of sender, source, receiver, and destination are interchangeable depending on what sort of data is being sent. Regardless of which term is used, the key concept to remember is that applications are present at both endpoints of a TCP connection. Without compatible applications at both ends, the data sent doesn’t end up anywhere, nor can it be processed and utilized. Nevertheless, changing the terminology between “source” and “destination” to describe every communication between two endpoints can be pretty confusing. A better model is to designate roles for each endpoint for the duration of the communication. The model of interaction on a TCP connection, then, is known as the client-server model.

In the client-server model, the term “server” describes the application that offers a service that can be utilized by any other application over the network. Servers accept connections over a network, perform their service, and respond with the result. The simplest servers are those that accept a single packet and respond with a single packet, though in many cases servers are more complex. Some features common to servers include the ability to accept more than one request at a time (multiple connections) and the ability to service requests independently of other operating system processes such as user sessions.

A “client” is the application sending the request to the server and waiting for the response. Client applications typically make only one request to a particular server at any given time, though there is no restriction preventing them from making simultaneous requests, or even multiple requests to different servers.

There are many different types of client-server applications. The simplest of them use UDP to communicate, while others use TCP/IP with higher-level application protocols such as File Transfer Protocol (FTP), Hypertext Transfer Protocol (HTTP), and Simple Mail Transfer Protocol (SMTP). The specifics of a client-server system are fairly simple. On the server, the application is started, after which it negotiates with the operating system for permission to use a particular port number for accepting requests. Assuming that the application is allowed to start on the given port, it begins to listen for incoming requests. The specifics of how to write an application that will listen on a port are covered in Chapter 2. Once listening, the server executes an endless loop until stopped: receive request, process request, assemble response, send response, repeat. In the process, the server reverses the source and destination addresses and port numbers, so that the server knows where to send the response and the client also knows that the server has responded.
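The server loop described above can be sketched in a few lines. This is only an illustration in Python, not the approach the book takes in Chapter 2, and it assumes the simplest case: a UDP service that answers one packet with one packet. The names open_server_socket, process, and serve are hypothetical, and the upper-casing "service" is a stand-in for real work.

```python
import socket

def open_server_socket(host="127.0.0.1", port=0):
    """Ask the operating system for a UDP port to accept requests on.

    Port 0 lets the OS pick any free port, which keeps this sketch runnable
    anywhere; a real service would pass its own well-known port instead.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))  # raises OSError if the port cannot be used
    return sock

def process(request: bytes) -> bytes:
    """Hypothetical service: return the request in upper case."""
    return request.upper()

def serve(sock, max_requests=None):
    """Receive request, process it, send the response, repeat.

    With max_requests=None the loop runs until the process is stopped.
    Returns the number of requests handled.
    """
    handled = 0
    while max_requests is None or handled < max_requests:
        request, client_addr = sock.recvfrom(4096)  # receive request
        response = process(request)                 # process, assemble response
        # The client's source address and port become the destination.
        sock.sendto(response, client_addr)          # send response
        handled += 1
    return handled
```

Calling serve(open_server_socket()) would run the loop indefinitely; the max_requests parameter exists only so the sketch can be exercised and stopped.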

The client, unlike the server, typically stops sending requests once the server has responded. At this point, the client itself may terminate, or it may simply go into a wait state until another request must be sent. Ports are handled differently on clients than on servers, however. While the server typically uses a reserved port, clients do not. This is because every client must know how to reach the server, but the server does not need to know in advance how to reach every client, since that information is contained in the packets it receives from the client, in the source address and source port fields. This allows clients to use any port they wish as their endpoint in the TCP connection with the server. For example, a web page server is usually found on port 80. Even though the client must send its requests to the server’s port 80, it can use any available port as its own endpoint, such as port number 9999, 12345, 64400, or anything in between, as long as it isn’t one of the reserved ports mentioned earlier or a port already in use by another application. Thus, the two endpoints involved in a web page request might be 192.168.2.1:80 for the server endpoint and 10.0.0.4:11908 for the client endpoint. The main requirement for a client is that it knows, through configuration or some other method, the address and port of the server, since all other information needed will be sent within the packets.
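To see the two endpoints of a connection, a client can ask its own socket which address and port it was given. The following Python sketch assumes a listener on the local machine stands in for the server; describe_endpoints is a hypothetical helper name. The client never chooses a port itself, so the operating system assigns one.

```python
import socket

def describe_endpoints(server_addr):
    """Connect to server_addr and return (client_endpoint, server_endpoint)."""
    with socket.create_connection(server_addr, timeout=5) as conn:
        # The client never called bind(); the OS picked its source port.
        return conn.getsockname(), conn.getpeername()

if __name__ == "__main__":
    # A throwaway local listener plays the part of the server.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    client_ep, server_ep = describe_endpoints(listener.getsockname())
    print("server endpoint:", server_ep)
    print("client endpoint:", client_ep)
    listener.close()
```

The port printed for the client endpoint will generally differ from run to run, which is the point: only the server's port has to be known in advance.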

The Domain Name System

We’ve discussed how it isn’t necessary for every node on every network to store information about how to reach all of the other nodes on the network. Because of the gateway rule, a node only needs to know how to reach nodes on its own network or a gateway. With IP addresses 32 bits long, there are plenty of addresses to go around (and when we run out, there’s always IPv6, which is covered in Appendix A). We still have a problem, though. While computers can take advantage of the gateway rule, people can’t. It’s not enough to instruct our computers to make a connection—we have to specify the other end of the connection, even if we don’t have to specify exactly how our packets will get there. We still need a system people can use to point computers to the other endpoint of our connections, without requiring them to remember nearly endless lists of numbers.

A system exists that lets us assign easily remembered names to their corresponding IP addresses, without having to remember the addresses themselves. This system is called the Domain Name System (DNS). You can think of DNS as a type of phone book for computers. Rather than remember the addresses of every network node you may want to reach, you instead can assign meaningful names to those nodes, and use DNS to translate the names that people like to use into the numbers that computers must use. To contact the other computer, your computer performs a DNS lookup, the result of which is the address of the other computer. Once the address is known, a connection can be made. Performing a DNS lookup is much like using a phone book. You know the name of the other computer, and you want to find the number, or address. You “look up” the name, and the phone book tells you the number that has been assigned to that name. From there, you make your call, or in our case, your connection.

How often do new phone books come out? What happens if someone changes his or her phone number before the new phone book is printed? Just like people change their phone numbers, computers change their addresses all the time. Some, in the case of dial-up networks, might change their addresses every day, or even several times a day. If our name-to-number system required that we all pass a new phone book around every day, our networks would come to a standstill overnight, since managing such a database of names and numbers and keeping it current for the number of computers connected together on today’s networks is impossible. DNS was designed to be easily updateable, redundant, efficient and, above all, distributed, by using a hierarchical format and a method of delegation. Just like the gateway rule, a DNS server isn’t required to know the names and addresses of every node on our networks. Instead, it’s only necessary for a DNS server to know the names and addresses of the nodes it’s managing, and the names and addresses of the authoritative servers in the rest of the hierarchy. Thus, a particular DNS server is delegated the responsibility for a particular set of addresses and is given the names and addresses of other DNS servers that it can use if it can’t resolve the name using its own information.

The hierarchical nature of DNS can be seen in the names commonly used on the Internet. You often hear the phrase dot-com and routinely use domain names when making connections. The first level of our hierarchy contains what are known as the top-level domains (TLDs). Table 1-9 contains a list of the more popular TLDs and their purpose.

Table 1-9. Popular Top-Level Domains

TLD     PURPOSE
.com    Commercial use
.net    Commercial use, though originally for use only by network service providers
.org    Noncommercial use by organizations and individuals
.mil    Military use
.edu    Educational institution use
.gov    Government use

The six domains in Table 1-9 are the original TLDs. In recent years, as the Internet became more and more popular, other TLDs were created, such as .biz, .name, .info, .pro, .coop, .aero, and .museum. In addition, every country in the world is assigned its own TLD, which is a two-letter English designation related to the country name. For example, the country TLD for the United States is .us, while the country TLD for Ireland is .ie and the country TLD for China is .cn. Each TLD is managed by a registrar, an organization that handles registration of domain names within a particular TLD. For some TLDs, such as .com, .net, and .org, there are many registrars. For others, such as .gov or .mil, there is only one.

A registrar is responsible for managing domain registrations. This means keeping them current, including accepting new registration requests as well as expiring those domains that are no longer valid. An example of a domain would be “apress.com”. Domain names are read right to left, with the leftmost name typically being the host name or name of a specific network node. Thus, a name such as www.apress.com would be translated as “the network node known as www within the domain apress within the TLD .com.” Another node within the same domain might have a name such as www2.apress.com. The actual names used are up to the discretion of the owner, with the following restrictions:

  • Names are restricted to letters and numbers, as well as a hyphen.
  • Names cannot begin or end with a hyphen.
  • Not including the TLD, names cannot exceed 63 characters.
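The three restrictions above can be checked mechanically. The sketch below is one way to do it in Python, assuming the rules apply to a single name between dots (such as “apress” or “www”); the function name is_valid_label is hypothetical.

```python
import re

# One name between dots: letters, digits, and hyphens only, 1 to 63
# characters long, with no hyphen in the first or last position.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_valid_label(label: str) -> bool:
    """Return True if label satisfies the naming restrictions listed above."""
    return _LABEL.fullmatch(label) is not None
```

Under these rules is_valid_label("apress") and is_valid_label("my-site") are accepted, while "-apress", "apress-", "my_site", and any name longer than 63 characters are rejected.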

The TLDs are managed by special domain name servers known as root servers. The root servers are special servers set up in redundant fashion and spread out across the Internet. The root servers are updated twice daily by the registrars. As of this writing, there are 13 root servers. You can see a list of their names, and the organizations responsible for them, at http://www.root-servers.org. Even though there are 13 root servers, there are actually more than 13 physical servers. The root servers are typically set up redundantly in diverse locations to spread out the load, and in general the closest one to the node making the request is the one that answers the request. Each root server has a list of the active domains and the name servers that are responsible for those domains. Remember that DNS is hierarchical and delegated. The root servers have a list of subordinate domain name servers, each one responsible for one or many domains such as apress.com or yahoo.com or google.com. Thus, the root servers for .com have a list of the servers handling a particular domain. A request for that particular name is handled by the server responsible, not by the root servers. Further delegation is possible, because a company or other organization can have its own domain name servers.

Let’s look at an example. You would like to visit the Apress website to download the source code used in this book. Putting everything we’ve discussed so far in this chapter together, that means your web browser is the client application, and the web server at Apress is the server application. You know that to make a successful connection, you need four pieces of information: source address, source port, destination address, and destination port. You know from your list of reserved ports that web servers are typically available on port 80, and you know that your client application can use any port it likes as the source port for the request, as long as the port isn’t already in use on your own computer. You also know your own address. Thus, the only thing you need to determine before making your TCP/IP connection and web page request is the address of the web server that is accepting requests for the Apress website.

To get the address, you’ll perform a DNS lookup on www.apress.com. Note that the lookup happens transparently to the user and is performed by the client application, or the application “making the call.” The goal is to translate the name www.apress.com into an address that your web browser can use to make its network connection. In this case, the name lookup happens in a series of steps:

  1. The browser queries one of the root servers for .com and asks for the IP addresses of the domain name servers (there are usually two, a primary and a backup) managing the domain named “apress”.
  2. The root server consults its database for a name matching “apress” and, if found, replies with the list of IP addresses of the domain name servers delegated to handle those requests.
  3. The browser queries one of the servers in that list and asks it for the IP address of the network node with the name of “www”.
  4. The domain name server managing the domain named “apress” consults its database for the name “www” and, if found, returns the IP address associated with that name.
  5. If the browser receives an answer from the domain name server with the IP address for www.apress.com , the browser makes a connection to port 80 at that IP address and requests the web page.
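The delegation in the steps above can be modeled with two lookup tables. This Python sketch makes no network requests; it is a toy simulation, and the tables are assumptions filled in with the addresses that appear in this chapter's example output. The names TLD_SERVERS, DOMAIN_SERVERS, and resolve are hypothetical.

```python
# Toy data modeled on the example in the text; this is not a live DNS query.
# What the servers for a TLD know: which name servers manage each domain.
TLD_SERVERS = {
    "com": {"apress": ["198.6.1.115", "198.6.1.154"]},
}
# What each delegated name server knows: the addresses of its own hosts.
DOMAIN_SERVERS = {
    "198.6.1.115": {"www": "65.215.221.149"},
    "198.6.1.154": {"www": "65.215.221.149"},
}

def resolve(name):
    """Return the address for a host.domain.tld name, or None if a step fails."""
    parts = name.lower().rstrip(".").split(".")
    if len(parts) != 3:
        return None  # this sketch handles only three-part names
    host, domain, tld = parts
    # Steps 1-2: ask the TLD level which name servers manage the domain.
    name_servers = TLD_SERVERS.get(tld, {}).get(domain)
    if not name_servers:
        return None
    # Steps 3-4: ask one of those servers for the host's address,
    # falling back to the next server in the list if necessary.
    for server in name_servers:
        address = DOMAIN_SERVERS.get(server, {}).get(host)
        if address is not None:
            return address
    return None
```

With this data, resolve("www.apress.com") returns "65.215.221.149", the address the browser would then connect to on port 80 (step 5). A name that is missing at either level returns None.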

The domain name system is transparent and public. You can make DNS queries in a number of different ways, from using the host and whois commands on your Linux system to using any of the web-based query sites. Let’s walk through the query process for apress.com that we just described, using the command line. To query the registration information held at the TLD level, you use whois:

[user@host user]$ whois apress.com

After the whois command executes, you’ll see output that looks like this:

  Domain Name: APRESS.COM
  Registrar: NETWORK SOLUTIONS, INC.
  Whois Server: whois.networksolutions.com
  Referral URL: http://www.networksolutions.com
  Name Server: AUTH111.NS.UU.NET
  Name Server: AUTH120.NS.UU.NET
  Status: ACTIVE
  Updated Date: 19-apr-2002
  Creation Date: 23-feb-1998
  Expiration Date: 22-feb-2007
Registrant:
Apress (APRESS-DOM)
2560 Ninth Street
Suite 219
Berkeley, CA 94710
US
Domain Name: APRESS.COM
Administrative Contact, Technical Contact: 
  Apress (23576335O)  wanshun_tam@apress.com
  2560 Ninth Street
  Suite 219
  Berkeley, CA 94710
  US
  510-549-5930 fax: 123 123 1234
Record expires on 22-Feb-2007.
Record created on 23-Feb-1998.
Database last updated on 18-Jan-2004 22:43:05 EST.
Domain servers in listed order:
AUTH111.NS.UU.NET     198.6.1.115
AUTH120.NS.UU.NET     198.6.1.154

The information is self-explanatory. You see that the registrar used to register the domain name is Network Solutions, and you see that the domain was registered in 1998 and is paid up through 2007. This is the information held at the TLD level; it still doesn’t tell you the IP address of the web server, which is what you need. The information you do have, though, includes the names and IP addresses of the name servers delegated to handle further information for apress.com, namely auth111.ns.uu.net and auth120.ns.uu.net, with addresses of 198.6.1.115 and 198.6.1.154, respectively. Either one of those name servers can help you find the address of the web server.

The next step is to use the host command to make a specific request of one of the name servers. You want the IP address of the machine with the name www.apress.com:

[user@host user]$ host www.apress.com auth111.ns.uu.net
Using domain server:
Name: auth111.ns.uu.net
Address: 198.6.1.115#53
Aliases:
www.apress.com has address 65.215.221.149

The host command takes two parameters: the name you want to translate into an IP address and the name (or address) of the server you want to use to make the translation. Using one of the name servers returned by the whois query, you ask for the IP address of the web server, and the name server responds with the IP address 65.215.221.149. At this point, your web browser would have the four pieces of information it needs to make a successful TCP/IP connection. Incidentally, you can see the port information for DNS in the name server’s reply shown previously. Note the #53 tacked onto the end of the name server’s IP address. As shown earlier in Table 1-3, the port used by DNS is port 53. The host command can also be used without specifying the name server as a parameter. If you just use the name that you want to translate, you’ll get an abbreviated response with just the IP address, without the other information.
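An application can perform the same name-to-address translation itself by asking the system's resolver, much as host does when no name server is given. The Python sketch below uses the standard library's socket.getaddrinfo; the helper name lookup_ipv4 and the default port of 80 are assumptions made for illustration.

```python
import socket

def lookup_ipv4(name, port=80):
    """Return the sorted IPv4 addresses the system resolver reports for name.

    Raises socket.gaierror if the name cannot be resolved.
    """
    infos = socket.getaddrinfo(
        name, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr), and for
    # IPv4 the sockaddr is an (address, port) pair.
    return sorted({sockaddr[0] for *_, sockaddr in infos})
```

A call such as lookup_ipv4("www.apress.com") needs a working network connection and configured name servers, and the addresses returned today may differ from the one shown in the output above.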

You can use the host command to make all sorts of domain name queries by using command-line parameters such as -t for type. For example, using a type of “MX” will return the names of the machines handling mail for a given domain name. Using a type of “NS” will return an abbreviated version of the whois information, listing the name servers themselves. Let’s see which machines handle mail and name serving for linux.org:

[user@host user]$ host -t mx linux.org
linux.org mail is handled by 10 mail.linux.org.
[user@host user]$ host -t ns linux.org
linux.org name server ns.invlogic.com.
linux.org name server ns0.aitcom.net.

The two queries tell you that mail for addresses in the linux.org domain is handled by a machine named mail.linux.org. Likewise, the name servers for linux.org are listed. If you wanted to send mail to someone at linux.org, you would use the name server information to resolve the name mail.linux.org into an IP address and make your connection from there. A list of common DNS record types and their purpose is shown in Table 1-10.

Table 1-10. Common DNS Record Types

TYPE    NAME                PURPOSE
A       Host Address        A 32-bit IP address identifying a specific host.
CNAME   Canonical Name      A name used as an alias for an A record.
MX      Mail Exchanger      The name of the host acting as a mail exchanger for the domain.
NS      Name Server         The name of an authoritative domain name server for this domain.
PTR     Pointer             A record that points an IP address to a name. This is the reverse of an A record.
SOA     Start of Authority  A multiple-field record specifying particulars for this domain, such as timeouts.

As you can see, DNS information is public information, and you can easily obtain it once you know what to look for and which commands to use. On your Linux system, use the man command to get more information on whois and host. Older systems use a utility called nslookup, which performs essentially the same functions as host. It’s also possible to have private DNS information, since any Linux system is capable of acting as a name server. Many companies and organizations use both private, or internal, DNS and public, or external, DNS. Internal DNS is used for those machines that aren’t available to the public.

Summary

In this chapter, we discussed the basic ingredients for today’s popular networking technologies. Here’s a summary of what we covered:

  • In general, networks are either packet-switched or circuit-switched, with the Internet being an example of a packet-switched network.
  • All networks need a common, physical medium to use for communications, the most popular of which is Ethernet. Ethernet uses a system of frames containing header and data portions.
  • Networks require the use of addressing so that one network node can find another network node. Addressing takes different forms, from MAC addresses used to identify physical network hardware interfaces to IP addresses used to identify virtual software addresses used by the TCP and IP network protocols.
  • The gateway rule means that a node does not need to know how to reach every other node on a network. It only needs to know how to reach the nodes on its own network, and how to reach the gateway between its own network and all other networks.
  • Using a system of protocol layering and encapsulation, IP and TCP “wrap” and “unwrap” each packet of header and data information as it travels up or down the protocol stack at each endpoint of the connection.
  • TCP/IP networks use the client-server model, in which the source of the communication is known as the client, and the server is the destination. The server is the network node providing services consumed by the client. Depending on whether the communication is a request or a response, the roles of client and server may change back and forth.
  • Because people find it easier to remember names instead of numbers, networks use name translation systems to translate familiar names into the actual IP addresses of the other network nodes. The most popular naming system is the Domain Name System (DNS), which is a collaborative, distributed, hierarchical system of managing namespaces where specific responsibilities for certain domains are delegated from root servers to subordinate servers.