
Although there are a number of models network engineers can use to understand security, perhaps none is more useful than the OODA (Observe, Orient, Decide, Act) loop for managing the threat/defense cycle. The OODA loop can be applied to network security in two ways:

 In a larger, or more strategic sense: The OODA loop can be applied to the “realm of security,” in terms of where the security world is, what common threats are, and how to react to them.

 In a more immediate, or more tactical sense: The OODA loop was originally designed to handle threats in real time, so it’s a perfect model to deal with a DDoS or other attack happening right now.

The OODA loop was first developed by a military strategist, Colonel John Boyd, in the context of air warfare. When it comes to air-to-air combat, how can one pilot gain the advantage needed to win on a consistent basis?

Breaking the process into pieces, so it can be understood and managed, would allow pilots to address each part of the process as an independent “thing.” Treating each step separately allows each part of the reaction process to be understood and optimized on its own. The key question, when examining the OODA loop for any particular situation, is: How much of my reaction can I stage and prepare before the attack actually happens?

Figure 9-1 shows the OODA loop.

Figure 9-1 OODA Loop

Attackers want to either “get inside the loop,” or “cut through the loop,” to achieve their goals.

 Observe: If I can keep you from seeing what I’m doing (so you don’t notice at all), I have the ability to launch a successful attack.

Here we see the use of open tunnels to reach into a network, or IPv6 attacks in networks where only IPv4 is being monitored, attacks against inside interfaces, and the like.

 Orient: If I can make you think I’m trying to overwhelm your server with traffic, but I’m actually trying to install a Trojan by overflowing the input interface buffers, then you will react to the wrong problem, possibly even opening the door to the real attack.

All sorts of feints fall into the category of orient, including most social engineering attacks.

 Decide: If I can attack you when I know your decision process will be slow, I can take advantage of my speed of attack to act before you can put the proper defenses in place. For instance, a long holiday weekend at 2 o’clock in the morning might be ideal, because the people who know what to do will be out on a remote beach enjoying their time off.

 Act: If I can prevent you from acting, or I can anticipate your action and use it against you, then I can launch a devastating attack against your network. For instance, if I can make an edge router unreachable from inside your network, I can use that moment to install a back door that allows me access later (whenever I want). Social engineering often uses peer pressure or social norms to prevent someone from acting when they should.

Defense, in terms of the OODA loop, is to make the loop work right—to observe the attack, orient to the attack (understand what type of attack it is), decide how to act, and then to implement modifications in network policy to stop the attack from damaging the network or the business. The defense wants to contain the attacker within the OODA loop; the loop itself provides the means of controlling and addressing the attack with the minimal amount of damage possible. The “tighter” the loop, the more effective the defense against any particular attack will be.

Let’s look at each of these four steps within the realm of security, considering both the larger and narrower sense in turn.

Observe

What should you observe? To answer this question, ask another question: What information is likely to be useful when diagnosing an attack on my network?

But doesn’t this answer depend on the type of attack? Of course, and that’s why the question, “What should I observe?” never has a really good answer. “Observe everything” isn’t practical in real life, so we set it aside and consider some other possibilities.

The first thing you should measure (or observe) is a general network baseline. For this first step, you want to ask yourself, “In what ways can I characterize network performance, and how can I produce a baseline of them?” Some examples might be:

 The speed at which the network converges:

o Measuring the delay and jitter of a stream as it passes across an intentionally failed path during a maintenance window.

o Determining the average amount of time it takes for an active route to move into the passive state in an EIGRP network, which can be determined by examining the EIGRP event log across a number of events on a widely dispersed set of routers within the network.

o Determining how long an average link state Shortest Path First (SPF) calculation takes. You can normally find this information through a show command on the router.

o Determining how long it takes for BGP to converge on a new path when an old path fails. You can discover this by adding a new prefix and removing it, then watching how long it takes for specific routers to receive and act on the modifications to the BGP table.

 The rate at which changes occur in the network:

o How often do links of each type change state, and why?

o How often does external routing information being fed into the network change? The speed at which BGP routes change from your ISP, partner, or other peer can be an important piece of information to consider when you’re facing an attack against a peering edge.

 The utilization of links throughout the network:

o Although you might not want to keep track of the utilization of every link in a truly large network, being able to classify links based on what role they play, where they are in the network, and what types of traffic pass over them, and relate this classification to a “normal load,” is very important.

o A more complex—but probably just as important—measure is the utilization pattern across common periods of time. Hourly, daily, weekly, and monthly utilization rates can be very useful in determining whether there’s really a problem, or if what you’re seeing is actually normal network operation.

o Who are the top talkers in each topological area of the network, and how often do they change?

 The quality of traffic flow through the network: Applications rely on consistent jitter and delay to work correctly; you can’t know what “wrong” looks like unless you know what “correct” looks like.
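As a minimal sketch of the delay and jitter measurements mentioned in the list above, the following assumes you already have matched send and receive timestamps for a probe stream. The function name `delay_and_jitter` and the sample timestamps are hypothetical, and jitter is taken here as the mean absolute difference between the delays of consecutive packets, which is only one of several common definitions.

```python
from statistics import mean

def delay_and_jitter(send_times, recv_times):
    """Return (mean delay, mean jitter) for a probe stream.

    Jitter is computed as the mean absolute difference between the
    delays of consecutive packets (an assumed, simple definition).
    """
    if len(send_times) != len(recv_times) or len(send_times) < 2:
        raise ValueError("need two or more matched send/receive timestamps")
    # Per-packet delay: receive time minus send time.
    delays = [r - s for s, r in zip(send_times, recv_times)]
    # Variation between each packet's delay and the next packet's delay.
    variations = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return mean(delays), mean(variations)

# Hypothetical probe stream: one packet every 20 ms, timestamps in milliseconds.
send = [0, 20, 40, 60, 80]
recv = [10, 31, 49, 72, 90]
avg_delay, avg_jitter = delay_and_jitter(send, recv)
print(f"mean delay {avg_delay:.1f} ms, mean jitter {avg_jitter:.1f} ms")
```

Running the same calculation during normal operation and again across an intentionally failed path gives two sets of numbers to compare, which is the point of having a baseline.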

Beyond a network baseline, you should also have the following on hand:

 Layer 3 network diagrams

 Layer 2 network diagrams

 Configurations for each device

One of the most difficult things to document—and to keep current—is the policies implemented on the network. This documentation should include not only what the policy is, but where it’s implemented and what the intent of the policy is.
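One possible way to keep this kind of documentation queryable is to store each policy as a small record holding the three pieces named above: what the policy is, where it is implemented, and what it is intended to do. The sketch below is only an illustration; the `PolicyRecord` class, the helper function, the device names, and the policies themselves are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class PolicyRecord:
    """One documented network policy: what it is, where it lives, and why."""
    name: str
    what: str
    intent: str
    implemented_on: list = field(default_factory=list)

def policies_on_device(records, device):
    """Return the names of all documented policies implemented on a device."""
    return [r.name for r in records if device in r.implemented_on]

# Hypothetical entries; the device names and policies are made up.
records = [
    PolicyRecord(
        name="edge-antispoof",
        what="Drop inbound packets sourced from internal address space",
        intent="Prevent spoofed traffic from entering at the peering edge",
        implemented_on=["edge-rtr-1", "edge-rtr-2"],
    ),
    PolicyRecord(
        name="mgmt-access",
        what="Permit SSH to devices only from the management subnet",
        intent="Limit who can reach device control planes",
        implemented_on=["edge-rtr-1", "core-sw-1"],
    ),
]
print(policies_on_device(records, "edge-rtr-1"))
```

Keeping the intent next to the location makes it easier to judge, in the middle of an incident, whether changing a policy on one device will undermine what it was meant to achieve.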

Note: The mechanisms used to observe these and other network measurements will be discussed in Chapter 10, “Measure Twice.”

Although this is a long list, most network engineers can recite it by heart, because it’s all about the network itself. There is another area that most network engineers miss in the realm of observing, however: the external security environment.

Here you can ask questions such as these:

 What attacks are other network operators seeing right now?

 What are the political and economic trends that might impact network security?

 What is the morale level within the company, and what impact will that have on the security of your network?

 What processes are in place to gain physical access to buildings, and how do these policies interact with access to network resources? Many networks require a password for wireless access, yet allow anyone to plug in to any random Ethernet jack in any building and gain access to the entire network.

 Are there political events within the company planned or in progress that will impact network security, such as a massive layoff, the relocation of a major facility, or a merger with some other company?

 Is there a sickness “going around,” and how will you respond to security incidents if half the network staff is out sick?

 If, for some reason, network operations folks can’t leave the facility for several days (in the case of a lockdown or natural disaster), are there sufficient food and sanitary supplies to support them?

 What kind of security training is supplied to the people who work on and around the network (including users of applications that have access to even moderately sensitive data)? If the training is minimal, what is the impact of social engineering attacks, and are there specific ways you can design your system to minimize these threats?

In short, you should not limit yourself to observations about the network itself when considering the overall security of your systems. External factors can often play a bigger role in your ability to secure your network than internal ones; social engineering is probably a bigger factor in network security breaches than outright Distributed Denial of Service (DDoS) or other attacks.

In the narrower sense, you want to be able to observe a specific attack within the context of the internal baseline information and external environment information you have on hand when the attack actually occurs.
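To make the idea of observing within the context of a baseline concrete, here is a minimal sketch that compares a live link utilization reading against a per-hour baseline. Everything in it is an assumption for illustration: the function names, the sample history, and the choice of flagging readings more than three standard deviations from the hourly mean.

```python
from collections import defaultdict
from statistics import mean, pstdev

def build_hourly_baseline(samples):
    """Map hour of day -> (mean, standard deviation) of link utilization.

    samples is an iterable of (hour_of_day, utilization_percent) pairs.
    """
    by_hour = defaultdict(list)
    for hour, util in samples:
        by_hour[hour].append(util)
    return {h: (mean(v), pstdev(v)) for h, v in by_hour.items()}

def is_unusual(baseline, hour, util, tolerance=3.0):
    """True when util is more than `tolerance` deviations from the hour's mean."""
    avg, dev = baseline[hour]
    if dev == 0:
        # No variation recorded for this hour: any different value is unusual.
        return util != avg
    return abs(util - avg) > tolerance * dev

# Hypothetical history: utilization (%) for two hours of the day over four days.
history = [(2, 10), (2, 12), (2, 8), (2, 10),
           (14, 60), (14, 70), (14, 65), (14, 65)]
baseline = build_hourly_baseline(history)
print(is_unusual(baseline, 14, 68))  # a busy afternoon reading
print(is_unusual(baseline, 2, 68))   # the same load at 2 a.m.
```

The same reading can be normal at one time of day and suspicious at another, which is why the utilization pattern across time matters more than a single average.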

Because you can’t predict everything you might want to measure, the key here is to be open-minded and flexible. Like a carpenter who has a lot of different tools, 80% of which are never used in the course of day-to-day work, a network engineer should have a wide array of measurement tools at hand.

Quite often, the larger and narrower views of observation combine the regular sampling of measurements with the ability to take more specific, more complete measurements reserved for particular circumstances. An example of this might be the regular sampling of traffic patterns using NetFlow, combined with the ability to span a port or insert a packet analyzer on the fly to measure traffic flows on an actual packet-by-packet basis.
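The combination of broad sampling and targeted capture can be sketched as a two-step process: estimate top talkers from sampled flow records, then pick out the sources whose volume justifies a closer, packet-by-packet look. The sketch below does not use any real NetFlow collector API; the record format (source address, sampled bytes), the function names, the 1-in-100 sampling rate, and the threshold are all assumptions.

```python
from collections import Counter

def top_talkers(flow_records, sampling_rate, count=3):
    """Estimate bytes per source from sampled flow records.

    flow_records is an iterable of (source_address, sampled_bytes) pairs;
    sampled byte counts are scaled up by the 1-in-N sampling rate.
    """
    totals = Counter()
    for source, sampled_bytes in flow_records:
        totals[source] += sampled_bytes * sampling_rate
    return totals.most_common(count)

def capture_candidates(talkers, byte_threshold):
    """Sources whose estimated volume justifies a full packet capture."""
    return [source for source, total in talkers if total > byte_threshold]

# Hypothetical 1-in-100 sampled records using documentation addresses.
records = [
    ("192.0.2.10", 1_500), ("192.0.2.10", 1_500),
    ("198.51.100.7", 90_000), ("198.51.100.7", 85_000),
    ("203.0.113.5", 400),
]
talkers = top_talkers(records, sampling_rate=100)
print(talkers)
print(capture_candidates(talkers, byte_threshold=10_000_000))
```

In practice the threshold would come from the baseline for that part of the network, and the candidates list would tell an operator where to span a port or attach a packet analyzer.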