Data Fragmentation - Linux - VPN - A Technical Guide to IPSec Virtual Private Networks

Fragmentation is the process of breaking a packet into multiple packets to accommodate transmission technologies/media. The process typically occurs on the host sending the packet, or the routers transmitting the packet from one environment to another. After fragmentation, each smaller packet is transmitted separately and reassembled at the destination.

The network topology being utilized places restrictions on the size of the frame that can be transmitted. The following are some examples of topologies and their maximum transmission unit (MTU):

The requirement to fragment arises because the network layer cannot always control the size of the datagram it is handed. Fragmentation can hurt performance. Transmitting more frames for the same data adds obvious overhead. Worse, if a single fragment is lost or damaged during transmission, the entire original packet must be retransmitted.

When a packet is fragmented, the portion of the payload that does not fit is given a new IP header with the same attributes as the original IP header. Let’s say that a UDP datagram carrying 1473 bytes of data is transmitted over Ethernet. The IP header is 20 bytes and the UDP header is 8 bytes, so the final packet is 1501 bytes, 1 byte larger than Ethernet can carry in the frame payload. The final byte of information (8 bits) is stripped off and given a new IP header. The first fragment keeps a fragment offset of 0 and has its More Fragments flag set; the new header on the second fragment carries a fragment offset of 1480 bytes (the 13-bit fragment offset field in the IP header counts in 8-byte units, so the stored value is 185). The offset is measured from the end of the original IP header and any options. Thus, it includes the UDP header of 8 bytes and the first 1472 bytes of user data (everything except the 1 byte carried in the second fragment). As the fragments are received by the final destination, they are reordered and processed. When the IP protocol stack sees the offset value, it places the remaining byte of information at the specified position in the collected data. Once the network layer has reassembled the UDP datagram, it is passed to the upper layers for continued processing.
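The arithmetic in this example can be checked with a short sketch. It assumes a 20-byte IP header with no options; the `Fragment` class and `fragment` function are illustrative names, not part of any real IP stack.

```python
from dataclasses import dataclass

IP_HEADER_LEN = 20  # bytes; assumes an IP header without options


@dataclass
class Fragment:
    offset_bytes: int     # where this fragment's data starts in the original payload
    offset_field: int     # value stored in the 13-bit offset field (8-byte units)
    data_len: int         # bytes of IP payload carried by this fragment
    more_fragments: bool  # state of the More Fragments (MF) flag


def fragment(total_len: int, mtu: int) -> list[Fragment]:
    """Split an IP datagram of total_len bytes to fit the given MTU."""
    payload = total_len - IP_HEADER_LEN
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the offset field counts in 8-byte units.
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    frags = []
    offset = 0
    while offset < payload:
        data_len = min(max_data, payload - offset)
        more = offset + data_len < payload
        frags.append(Fragment(offset, offset // 8, data_len, more))
        offset += data_len
    return frags


# The 1501-byte datagram from the text, sent over a 1500-byte MTU.
for frag in fragment(1501, 1500):
    print(frag)
```

For the 1501-byte datagram this yields two fragments: 1480 bytes at offset 0 with MF set, and 1 byte at offset 1480 (field value 185) with MF clear.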

Normally, IPSec is not affected by fragmentation because it operates at the network layer and fragmentation typically occurs after IP processing. Therefore, fragmentation is performed on datagrams to which IPSec has already been applied. Upon receipt of a fragmented packet, a system reassembles the packet before providing it to the upper layers for processing; therefore, IPSec only ever processes whole datagrams.

In standard operation, a host generating datagrams is aware of the local interface’s MTU, and the upper layer creates packets of the appropriate size given that local information. As the packet is forwarded, it may reach a router that connects to the destination network, where the next network’s MTU is smaller than that of the originating network. If the Don’t Fragment (DF) bit is set in the IP header, the router drops the packet and sends an ICMP message of type 3, code 4 (fragmentation needed and DF set) that notifies the originating host that the packet is too large for the next hop’s MTU. Otherwise, the router fragments the packet and forwards it to the destination. Ideally, the host maintains a set of MTU values for the routes it uses. This allows the host to build datagrams of the proper size and avoid fragmentation farther along the path.
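The router’s choice described above can be sketched as a small decision function. The function name and return shape are hypothetical; the ICMP type and code come from the text, and reporting the next-hop MTU inside the ICMP message follows the Path MTU Discovery convention of RFC 1191, which is an addition beyond what the text states.

```python
ICMP_DEST_UNREACHABLE = 3  # ICMP type
ICMP_FRAG_NEEDED = 4       # ICMP code: fragmentation needed and DF set


def forward_decision(packet_len: int, df_set: bool, next_hop_mtu: int):
    """Return (action, icmp) for a packet about to leave on a given interface.

    icmp is None unless the packet is dropped, in which case it is a
    (type, code, next_hop_mtu) tuple describing the message sent back.
    """
    if packet_len <= next_hop_mtu:
        return ("forward", None)
    if df_set:
        # Too big and fragmentation is forbidden: drop and notify the sender.
        return ("drop", (ICMP_DEST_UNREACHABLE, ICMP_FRAG_NEEDED, next_hop_mtu))
    # Too big but fragmentation is allowed.
    return ("fragment", None)


print(forward_decision(1500, True, 576))
print(forward_decision(1500, False, 576))
```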

The IPSec process does affect the effective MTU: by introducing extra headers and extended data, it enlarges the datagram created by a host’s upper layer. Typically, a host maintains MTU information at the transport layer to create a stateful MTU agreement between communicating hosts. When IPSec is implemented between the two hosts, it informs the transport layer to reduce the datagram size by an amount that accommodates the information to be added by the IPSec operations. As MTU information is learned from the network layer (from ICMP messages), the IPSec implementation is consulted to calculate the MTU that should be reflected to the transport layer. This is necessary because different networks and destinations may have different policies that dictate which of the available security suites are used. In some cases, additional options applied to the communication increase the size of the IPSec payloads, further reducing the size of the datagram that the upper layers may generate. In short, the IPSec SPD must be consulted to determine the MTU information that is passed up to the transport layer.
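As a rough illustration of consulting policy before reporting an MTU upward, the sketch below looks up a per-destination overhead and subtracts it from the learned path MTU. The policy table, its networks, and the overhead figures are all made up for the example; real overhead depends on the protocol, mode, and algorithms selected by the SPD entry.

```python
import ipaddress

# Hypothetical policy table: destination network -> bytes added by IPSec.
# The networks and byte counts are illustrative values only.
SPD_OVERHEAD = {
    ipaddress.ip_network("10.2.0.0/16"): 20,
    ipaddress.ip_network("10.3.0.0/16"): 56,
}


def effective_mtu(dst: str, path_mtu: int) -> int:
    """MTU to report to the transport layer for traffic sent to dst."""
    addr = ipaddress.ip_address(dst)
    for network, overhead in SPD_OVERHEAD.items():
        if addr in network:
            # Leave room for the headers IPSec will add to this traffic.
            return path_mtu - overhead
    # No policy matches: traffic is sent in the clear, no reduction needed.
    return path_mtu


print(effective_mtu("10.2.1.5", 1500))
print(effective_mtu("192.0.2.1", 1500))
```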

When a packet with the DF bit set reaches a router that cannot forward it without fragmentation, the router sends the originating system an ICMP message that carries the original IP header plus the first 64 bits (8 bytes) of the original datagram’s data. However, IPSec communications represent an interesting situation.

Consider the scenario shown in Exhibit 7-22. There is a host at either end (H1 and H2) of a network that has various MTUs along the communications path. The network connection is provided by three routers (RA, RB, and RC), and a VPN is established between RA and RC. RA and RC have a policy that states that all traffic destined for either network where the hosts reside is to be tunneled in an ESP IPSec VPN.

A 500-byte packet destined to H2 is generated by H1. When the packet arrives at RA, IPSec is applied, resulting in a packet of 520 bytes, which is forwarded to the next-hop router, RB. Router RB receives what appears to be a standard IP packet (remember that RB is unaware of the VPN) and quickly determines that the exit interface supports an MTU of only 296, which requires fragmentation. However, at some point, probably at the host, the DF bit was set, forcing RB to send an ICMP message with the first 64 bits of data to the originating IP address. The originating IP address in this example is RA, because RA tunneled the original packet and placed a new IP header on the IPSec payload. If ESP is employed, the resulting message returned to the originator is a new IP header, followed by the ICMP header, and then the SPI and IPSec sequence number from the payload of the original IP packet. This is because the ICMP message only allows for the first 64 bits of that payload. With AH as the security protocol, those 64 bits hold the next header and payload length fields of the AH header, a reserved field, and the SPI.
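To make concrete what those 64 bits contain, here is a minimal sketch that decodes the first 8 bytes of an ESP header and of an AH header using the standard field layouts. The function names and the sample byte strings are invented for illustration.

```python
import struct


def parse_esp_prefix(data: bytes) -> dict:
    """First 8 bytes of ESP: 32-bit SPI, then 32-bit sequence number."""
    spi, sequence = struct.unpack("!II", data[:8])
    return {"spi": spi, "sequence": sequence}


def parse_ah_prefix(data: bytes) -> dict:
    """First 8 bytes of AH: next header, payload length, reserved, SPI."""
    next_header, payload_len, _reserved, spi = struct.unpack("!BBHI", data[:8])
    return {"next_header": next_header, "payload_len": payload_len, "spi": spi}


# Made-up sample bytes: SPI 0x1001 in both cases.
print(parse_esp_prefix(bytes.fromhex("0000100100000007")))
print(parse_ah_prefix(bytes.fromhex("0404000000001001")))
```

In both cases the SPI survives in the returned bytes, but with AH the sequence number does not, because it sits beyond the first 64 bits.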

This situation creates some interesting issues. The first issue that immediately arises is that when the ICMP message reaches RA, RA does not know which system originated the packet. It receives only an SPI, not the host’s IP address, and that SPI simply identifies the SA used to protect all traffic from the network H1 is on to the network H2 is on.

Therefore, how does RA know which host, on the network it has an SA for, sent the original packet? Of course, we know because we can see that there is only one host in the example, but RA has no clue. RA has two options for handling this situation:

1. Send the ICMP MTU message to all the possible hosts that can be associated with the identified SPI. Depending on the SPD, the SAD, and the defined networks, this could include a single host, a range of IP addresses on a network, or every IP address on a specific network. In reality, there is an even worse scenario, in which a wildcard identifies all traffic as interesting and subject to IPSec, so that every possible host within shouting distance receives an ICMP message, which could be bad.

2. Save the MTU and SPI information and wait for a host to send traffic over the offending SA. When a packet larger than the saved MTU associated with the offending SA arrives from an internal host, drop it, generate an ICMP message based on the data collected from RB and the first 64 bits of information from the dropped packet, and send it to the host. Basically, RA becomes an ICMP proxy for RB and notifies internal hosts using the SA (identified by the SPI that was returned in the original ICMP message) to reduce their MTU.
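The second option can be sketched as a small per-SA cache on RA. This is only an outline under stated assumptions: the class and method names are hypothetical, the inner packet is assumed to have DF set, and subtracting the IPSec overhead before comparing sizes is an added refinement that the text does not spell out.

```python
class PmtuCache:
    """Remembers the smallest MTU reported for each SA, keyed by SPI."""

    def __init__(self):
        self._mtu_by_spi = {}

    def record_icmp(self, spi: int, reported_mtu: int) -> None:
        """Store the MTU learned from an ICMP message that quoted this SPI."""
        current = self._mtu_by_spi.get(spi)
        if current is None or reported_mtu < current:
            self._mtu_by_spi[spi] = reported_mtu

    def check_outbound(self, spi: int, packet_len: int,
                       ipsec_overhead: int, src_host: str):
        """Decide whether a packet from src_host can be tunneled on this SA."""
        limit = self._mtu_by_spi.get(spi)
        if limit is None or packet_len + ipsec_overhead <= limit:
            return ("tunnel", None)
        # Act as an ICMP proxy: tell the sending host what size will fit.
        icmp = {"to": src_host, "type": 3, "code": 4,
                "mtu": limit - ipsec_overhead}
        return ("drop", icmp)


cache = PmtuCache()
cache.record_icmp(spi=0x1001, reported_mtu=296)  # learned from RB
print(cache.check_outbound(0x1001, 500, 20, "H1"))
```

Only hosts that actually send oversized traffic over the SA receive a message, which is what keeps this approach from flooding the network.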

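Exhibit 7-22. A network that may require fragmentation.

[Figure: hosts H1 and H2 connected through routers RA, RB, and RC, with a VPN between RA and RC. MTU values labeled in the figure: 1492, 576, 296, and 4464.]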

Of the two basic operations, the latter works in every situation, whereas the former can become problematic. Therefore, storing the SPI and SA MTU and providing that information to internal hosts as they send traffic is a required feature for an IPSec implementation.

An alternative that avoids some of these issues is for RA not to set the DF bit in the outer IP header even though the DF bit in the inner header is set. The result is that RB fragments the tunneled packet immediately, eliminating the entire ICMP process.

In document Linux - VPN - A Technical Guide to IPSec Virtual Private Networks (Page 159-162)