Heartbeat Classification
4.6 Classification Discussion
Now that we have provided details on each class of heartbeat, we can develop a taxonomy. The taxonomy of heartbeats is illustrated in Figure 4.5. It relates the classes of heartbeat to one another in a hierarchical structure, starting with all traffic at the root and progressing through each class discussed in this chapter. In addition, we quantify the types of heartbeat at every level of the taxonomy.
The first two levels of the taxonomy are included for completeness. The root of the taxonomy represents the aggregate traffic set. The next level down then separates the detected heartbeats from the non-periodic traffic.
The first pertinent level (the third level) distinguishes between regular and irregular heartbeats. Figure 4.5 shows that only 0.01% of all observed heartbeats were regular. As discussed in Section 4.2, there are many causes of irregularity. This dramatic split indicates that irregularity is highly prevalent in our dataset. Further, when we analyze the full log, our method detects 115,423 irregular heartbeats, but this number increases to 244,337 when we analyze daily and hourly logs instead. This demonstrates that our method is poor at detecting irregular heartbeats over long durations, and it raises questions about the efficacy of periodicity detection in IDS software.
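The size of this gap can be checked directly from the two counts quoted above. The following is a minimal sketch that uses only those two figures; the variable names are illustrative, and the comparison assumes that the two counts are directly comparable, which the text does not state explicitly.

```python
# Irregular heartbeat counts quoted in the text.
full_log_count = 115_423   # detected when analyzing the full log
windowed_count = 244_337   # detected when analyzing daily and hourly logs

# Difference and ratio between the two analyses.
additional = windowed_count - full_log_count
ratio = windowed_count / full_log_count

print(f"additional irregular heartbeats: {additional:,}")
print(f"windowed / full-log ratio: {ratio:.2f}")
```

Under that assumption, the shorter analysis windows reveal roughly twice as many irregular heartbeats as the full-log analysis.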
[Figure 4.5: Taxonomy of heartbeats. Root node: Aggregate Traffic.]
The next level of the taxonomy draws a distinction between inbound and outbound heartbeats.
Figure 4.5 shows that regular heartbeats were mostly inbound. Regular, inbound heartbeats were primarily associated with services offered by hosts on our edge network. The rest were related to scanning for specific services. It is therefore sensible that most of these heartbeats are inbound.
The regular, outbound heartbeats were related to external services provided by vendors.
Conversely, irregular heartbeats were mostly outbound. These heartbeats were generated primarily by end-user applications, especially P2P applications. The services and peers that end-user applications interact with reside outside our edge network. Thus, outbound heartbeats of this nature are normal in our network.
Inbound irregular heartbeats have a slightly different composition. Like outbound irregular heartbeats, inbound irregular heartbeats are made up, in part, of peers attempting to contact other peers on our network. Unlike their outbound counterparts, the non-P2P, inbound, irregular heartbeats were mainly service providers attempting to contact clients on our network. A relatively small number of these heartbeats were also related to scanning of our network.
The final two levels of the taxonomy are defined by the properties of liveness and application architecture (P2P or not).
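The four properties that define the taxonomy levels below the heartbeat/non-periodic split can be sketched as a small data structure. This is an illustrative sketch only, not the implementation used in this work: the class names, the `path` and `count_by_level` helpers, and the sample records are all hypothetical, and the ordering of liveness before application architecture is an assumption.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Regularity(Enum):
    REGULAR = "regular"
    IRREGULAR = "irregular"


class Direction(Enum):
    INBOUND = "inbound"
    OUTBOUND = "outbound"


class Liveness(Enum):
    ALIVE = "alive"
    DEAD = "dead"


class Architecture(Enum):
    P2P = "P2P"
    NON_P2P = "non-P2P"


@dataclass(frozen=True)
class Heartbeat:
    """One detected heartbeat, labeled with the four taxonomy properties."""
    regularity: Regularity
    direction: Direction
    liveness: Liveness
    architecture: Architecture

    def path(self):
        # Taxonomy path below the "detected heartbeats" node.
        # Assumed order: regularity, direction, liveness, architecture.
        return (self.regularity, self.direction,
                self.liveness, self.architecture)


def count_by_level(heartbeats, depth):
    """Count heartbeats per taxonomy node at the given depth (1 to 4)."""
    return Counter(hb.path()[:depth] for hb in heartbeats)


# Hypothetical sample records, not taken from the dataset.
sample = [
    Heartbeat(Regularity.IRREGULAR, Direction.OUTBOUND,
              Liveness.ALIVE, Architecture.P2P),
    Heartbeat(Regularity.IRREGULAR, Direction.OUTBOUND,
              Liveness.DEAD, Architecture.P2P),
    Heartbeat(Regularity.REGULAR, Direction.INBOUND,
              Liveness.ALIVE, Architecture.NON_P2P),
]

counts = count_by_level(sample, depth=2)
for node, n in counts.items():
    print(" / ".join(p.value for p in node), n)
```

Counting on path prefixes mirrors how the taxonomy is quantified: each node's count is the sum of the counts of its children, so the same records yield the totals at every level.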
Both regular and irregular outbound heartbeats have similar liveness statistics. In both classes, outbound heartbeats are mostly alive. This is sensible based on the observations above. Although churn in P2P networks causes many dead heartbeats (especially compared to centralized services), one would expect that in an effective P2P network most hosts would produce alive heartbeats.
Likewise, for services to be useful, it stands to reason that client machines must be able to reach servers reliably. Thus, this liveness split is what one would expect for both P2P and non-P2P applications.
However, P2P applications generate the bulk of the outbound, dead heartbeats. As mentioned above, churn in P2P networks is the primary culprit for this difference. The difference is accentuated by the fact that a single host, our Akamai node, produced 5,877 of the 7,107 dead, outbound, non-P2P heartbeats. Another contributor to the large number of dead P2P heartbeats is that many peers could be positioned behind firewalls that block P2P communications.
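To make the Akamai node's contribution concrete, the share can be computed from the two counts given above. This is a small arithmetic sketch; the variable names are illustrative.

```python
# Counts quoted in the text for dead, outbound, non-P2P heartbeats.
akamai_dead = 5_877
total_dead_outbound_non_p2p = 7_107

# Share produced by the Akamai node, and what remains without it.
akamai_share = akamai_dead / total_dead_outbound_non_p2p
remaining = total_dead_outbound_non_p2p - akamai_dead

print(f"Akamai share: {akamai_share:.1%}")
print(f"remaining dead, outbound, non-P2P heartbeats: {remaining:,}")
```

Setting the Akamai node aside leaves comparatively few dead, outbound, non-P2P heartbeats, which is why P2P applications dominate this class.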
Regular and irregular inbound heartbeats have some statistical similarities and differences. As shown in Figure 4.5, the majority of the inbound, irregular heartbeats are dead. The larger proportion of dead, inbound, irregular heartbeats is also sensible considering that many are related to P2P applications and service vendors. As stated earlier, churn and firewalls are obstacles for P2P applications. Similarly, services could fail to contact end-user machines due to the nature of many University subnets. Stated plainly, a user could establish a heartbeat with a service and then simply move to a different subnet (in a BYOD environment), or even shut the application off.
Regular, inbound heartbeats do not share this liveness split. The majority of regular inbound heartbeats are alive. The magnitude of the difference is due to the fact that there are no regular P2P heartbeats. However, irregular inbound heartbeats are still more often dead than alive, even when ignoring P2P heartbeats. This difference is due to a larger amount of unsuccessful scanning in the irregular category. The regular, inbound, dead heartbeats are primarily related to services failing to contact client machines and to unsuccessful scanning. The alive counterpart is primarily related to successful periodic scanning and heartbeats involving University-hosted services.
4.7 Summary
In this chapter, we have constructed a classification of network heartbeats. We first developed a context for our work by providing a statistical summary of aggregate and heartbeat traffic. We then presented the four properties that allow for the classification of heartbeats: regularity, direction, liveness, and application architecture. While doing so, we discussed the classes that are derived from these properties, as well as how each class manifests in our dataset. Finally, we presented our completed taxonomy, quantified each type, and discussed the relationships between classes. The next chapter examines the characteristics of heartbeats in more detail.