This book excerpt is from Chapter 2 of the second edition of Firewalls and Internet Security: Repelling the Wily Hacker . This chapter, titled "A Security Review of Protocols: Lower Layers," examines some of the basic protocols from a security perspective. It is posted with permission from Addison-Wesley Professional.
In the next two chapters, we present an overview of the TCP/IP protocol suite. This chapter covers the lower layers and some basic infrastructure protocols, such as DNS; the next chapter discusses middleware and applications. Although we realize that this is familiar material to many people who read this book, we suggest that you not skip the chapter; our focus here is on security, so we discuss the protocols and areas of possible danger in that light.
A word of caution: A security-minded system administrator often has a completely different view of a network service than a user does. These two parties are often at opposite ends of the security/convenience balance. Our viewpoint is tilted toward one end of this balance.
2.1 Basic Protocols
TCP/IP is the usual shorthand for a collection of communications protocols that were originally developed under the auspices of the U.S. Defense Advanced Research Projects Agency (then DARPA, later ARPA, now DARPA again), and was deployed on the old ARPANET in 1983. The overview we can present here is necessarily sketchy. For a more thorough treatment, the reader is referred to any of a number of books, such as those by Comer [Comer, 2000; Comer and Stevens, 1998; Comer et al., 2000], Kurose and Ross [2002], or Stevens [Stevens, 1995; Wright and Stevens, 1995; Stevens, 1996].
A schematic of the data flow is shown in Figure 2.1. Each row is a different protocol layer. The top layer contains the applications: mail transmission, login, video servers, and so on. These applications call the lower layers to fetch and deliver their data. In the middle of the spiderweb is the Internet Protocol (IP) [Postel, 1981b]. IP is a packet multiplexer. Messages from higher level protocols have an IP header prepended to them. They are then sent to the appropriate device driver for transmission. We will examine the IP layer first.
2.1.1 IP
IP packets are the bundles of data that form the foundation for the TCP/IP protocol suite. Every packet carries a source and destination address, some option bits, a header checksum, and a payload of data. A typical IP packet is a few hundred bytes long. These packets flow by the billions across the world over Ethernets, serial lines, SONET rings, packet radio connections, frame relay connections, Asynchronous Transfer Mode (ATM) links, and so on.
There is no notion of a virtual circuit or "phone call" at the IP level: every packet stands alone. IP is an unreliable datagram service. No guarantees are made that packets will be delivered, delivered only once, or delivered in any particular order. Nor is there any check for packet correctness. The checksum in the IP header covers only that header.
In fact, there is no guarantee that a packet was actually sent from the given source address. Any host can transmit a packet with any source address. Although many operating systems control this field and ensure that it leaves with a correct value, and although a few ISPs ensure that impossible packets do not leave a site [Ferguson and Senie, 2000], you cannot rely on the validity of the source address, except under certain carefully controlled circumstances. Therefore, authentication cannot rely on the source address field, although several protocols do just that. In general, attackers can send packets with faked return addresses: this is called IP spoofing. Authentication, and security in general, must use mechanisms in higher layers of the protocol.
A packet traveling a long distance will travel through many hops. Each hop terminates in a host or router, which forwards the packet to the next hop based on routing information. How a host or router determines the proper next hop is discussed in Section 2.2.1. (The approximate path to a given site can be discovered with the traceroute program. See Section 8.4.3 for details.)
Along the way, a router is allowed to drop packets without notice if there is too much traffic. Higher protocol layers (i.e., TCP) are supposed to deal with these problems and provide a reliable circuit to the application.
If a packet is too large for the next hop, it is fragmented. That is, it is divided into two or more packets, each of which has its own IP header, but only a portion of the payload. The fragments make their own separate ways to the ultimate destination. During the trip, fragments may be further fragmented. When the pieces arrive at the target machine, they are reassembled. As a rule, no reassembly is done at intermediate hops.
Some packet filters have been breached by being fed packets with pathological fragmentation [Ziemba et al., 1995]. When important information is split between two packets, the filter can misprocess or simply pass the second packet. Worse yet, the rules for reassembly don't say what should happen if two overlapping fragments have different content. Perhaps a firewall will pass one harmless variant, only to find that the other dangerous variant is accepted by the destination host [Paxson, 1998]. (Most firewalls reassemble fragmented packets to examine their contents. This processing can also be a trouble spot.) Fragment sequences have also been chosen to tickle bugs in the IP reassembly routines on a host, causing crashes (see CERT Advisory CA-97.28). I
IP Addresses
Addresses in IP version 4 (IPv4), the current version, are 32 bits long and are divided into two parts, a network portion and a host portion. The boundary is set administratively at each node, and in fact can vary within a site. (The older notion of fixed boundaries between the two address portions has been abandoned, and has been replaced by Classless Inter-Domain Routing (CIDR). A CIDR network address is written as follows:
In this example, the first 25 bits are the network field (often called the prefix); the host field is the remaining seven bits.)
Host address portions of either all 0s or all 1s are reserved for broadcast addresses. A packet sent with a foreign network's broadcast address is known as a directed broadcast; these can be very dangerous, as they're a way to disrupt many different hosts with minimal effort. Directed broadcasts have been used by attackers; see Section 5.8 for details. Most routers will let you disable forwarding such packets; we strongly recommend this option.
People rarely use actual IP addresses: they prefer domain names. The name is usually translated by a special distributed database called the Domain Name System, discussed in Section 2.2.2.
2.1.2 ARP
IP packets are often sent over Ethernets. Ethernet devices do not understand the 32-bit IPv4 addresses: They transmit Ethernet packets with 48-bit Ethernet addresses. Therefore, an IP driver must translate an IP destination address into an Ethernet destination address. Although there are some static or algorithmic mappings between these two types of addresses, a table lookup is usually required. The Address Resolution Protocol (ARP) [Plummer, 1982] is used to determine these mappings. (ARP is used on some other link types as well; the prerequisite is some sort of link-level broadcast mechanism.)
ARP works by sending out an Ethernet broadcast packet containing the desired IP address. That destination host, or another system acting on its behalf, replies with a packet containing the IP and Ethernet address pair. This is cached by the sender to reduce unnecessary ARP traffic.
There is considerable risk here if untrusted nodes have write access to the local net. Such a machine could emit phony ARP queries or replies and divert all traffic to itself; it could then either impersonate some machines or simply modify the data streams en passant. This is called ARP spoofing and a number of Hacker Off-the-Shelf (HOTS) packages implement this attack.
The ARP mechanism is usually automatic. On special security networks, the ARP mappings may be statically hardwired, and the automatic protocol suppressed to prevent interference. If we absolutely never want two hosts to talk to each other, we can ensure that they don't have ARP translations (or have wrong ARP translations) for each other for an extra level of assurance. It can be hard to ensure that they never acquire the mappings, however.
2.1.3 TCP
The IP layer is free to drop, duplicate, or deliver packets out of order. It is up to the Transmission Control Protocol (TCP) [Postel, 1981c] layer to use this unreliable medium to provide reliable virtual circuits to users' processes. The packets are shuffled around, retransmitted, and reassembled to match the original data stream on the other end.
The ordering is maintained by sequence numbers in every packet. Each byte sent, as well as the open and close requests, are numbered individually. A separate set of sequence numbers is used for each end of each connection to a host.
All packets, except for the very first TCP packet sent during a conversation, contain an acknowledgment number; it provides the sequence number of the next expected byte. Every TCP message is marked as coming from a particular host and port number, and going to a destination host and port. The 4-tuple
(localhost, localport, remotehost, remoteport)
uniquely identifies a particular circuit. It is not only permissible, it is quite common to have many different circuits on a machine with the same local port number; everything will behave properly as long as either the remote address or the port number differ.
Servers, processes that wish to provide some Internet service, listen on particular ports. By convention, server ports are low-numbered. This convention is not always honored, which can cause security problems, as you'll see later. The port numbers for all of the standard services are assumed to be known to the caller. A listening portis in some sense half-open; only the local host and port number are known. (Strictly speaking, not even the local host address need be known. Computers can have more than one IP address, and connection requests can usually be addressed to any of the legal addresses for that machine.) When a connection request packet arrives, the other fields are filled in. If appropriate, the local operating system will clone the listening connection so that further requests for the same port may be honored as well.
Clients use the offered services. They connect from a local port to the appropriate server port. The local port is almost always selected at random by the operating system, though clients are allowed to select their own.
Most versions of TCP and UDP for UNIX systems enforce the rule that only the superuser (root) can create a port numbered less than 1024. These are privileged ports. The intent is that remote systems can trust the authenticity of information written to such ports. The restriction is a convention only, and is not required by the protocol specification. In any event, it is meaningless on non-UNIX operating systems. The implications are clear: One can trust the sanctity of the port number only if one is certain that the originating system has such a rule, is capable of enforcing it, and is administered properly. It is not safe to rely on this convention.
Figure 2.2: TCP Open: The client sends the server a packet with the SYN bit set, and an initial client sequence number CSEQ. The server's reply packet has both the SYN and ACK packets set, and contains both the client's (plus 1) and server's sequence number (SSEQ) for this session. The client increments its sequence number, and replies with the ACK bit set. At this point, either side may send data to the other. |
TCP Open
TCP open, a three-step process, is shown in Figure 2.2. After the server receives the initial SYN packet, the connection is in a half-opened state. The server replies with its own sequence number, and awaits an acknowledgment, the third and final packet of a TCP open.
Attackers have gamed this half-open state. SYN attacks (see Section 5.8.2) flood the server with the first packet only, hoping to swamp the host with half-open connections that will never be completed. In addition, the first part of this three-step process can be used to detect active TCP services without alerting the application programs, which usually aren't informed of incoming connections until the three-packet handshake is complete (see Section 6.3 for more details).
The sequence numbers mentioned earlier have another function. Because the initial sequence number for new connections changes constantly, it is possible for TCP to detect stale packets from previous incarnations of the same circuit (i.e., from previous uses of the same 4-tuple). There is also a modest security benefit: A connection cannot be fully established until both sides have acknowledged the other's initial sequence number. 4