Heartbeat Classification
4.1 Network Traffic Overview
In this section, we provide a statistical overview of both our aggregate traffic and periodic traffic datasets. The aggregate traffic summary establishes the norm for our network by considering all network traffic as a whole. This provides context for the interpretation of the periodic traffic on edge networks. The periodic traffic summary allows us to develop a baseline understanding of how periodic traffic differs from general network traffic.
4.1.1 Aggregate Traffic
The aggregate traffic from which we distill network heartbeats is complex. As Table 3.2 in Chapter 3 shows, heartbeats are a small component of the network traffic at any given timescale. Thus, identifying this traffic is challenging.
To convey the baseline traffic composition on our network, we provide a statistical breakdown of the aggregate traffic. We conducted our analysis at the protocol level and the protocol/port level.
Figure 4.1: Summary of protocols and ports for aggregate traffic. Panels: (a) Protocols, (b) Ports.
Table 4.1: Top 10 ports and protocols observed in the aggregate traffic.
Number of Connections  Protocol  Port  Registered Service
3,001,962,604          TCP       443   HTTPS
2,949,772,466          TCP       23    Telnet
2,144,724,698          UDP       53    DNS
1,892,680,004          TCP       80    HTTP
485,566,113            ICMP      0     ICMP Network Unreachable
376,924,831            TCP       22    SSH
328,420,663            TCP       5358  WSD
272,396,683            TCP       7547  CPE WAN
222,413,478            TCP       2323  Telnet (alt port)
Figure 4.1 provides an overview of the traffic composition on our edge network. At the protocol level, Figure 4.1(a) shows that the aggregate traffic, measured by connections, is 73% TCP, 23% UDP, and 4% ICMP. At the port level, we classify ports as system (0-1,023), user (1,024-49,151), or dynamic (49,152-65,535), following the IANA definition¹. Figure 4.1(b) shows that most aggregate connections use system ports. In both figures, and throughout this thesis, red denotes UDP, blue denotes TCP, and gold denotes ICMP. For UDP and TCP, the darkest shade of the respective colour represents the system port range, and the shade becomes progressively lighter for the user and dynamic ranges.
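The port classification can be illustrated with a minimal sketch. It assumes only the three ranges listed above; the function name classify_port is hypothetical and is not taken from the analysis tooling used in this thesis.

```python
def classify_port(port: int) -> str:
    """Return the port range name (system, user, or dynamic) for a TCP/UDP port."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if port <= 1023:
        return "system"   # 0-1,023
    if port <= 49151:
        return "user"     # 1,024-49,151
    return "dynamic"      # 49,152-65,535
```

For example, classify_port(443) yields "system" and classify_port(7547) yields "user".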
Table 4.1 presents the top 10 most popular port/protocol combinations in our aggregate dataset.
The majority of the popular protocol/port combinations are related to well-known services, and most of these services run over TCP, such as SSH (TCP/22) and HTTP (TCP/80). The only UDP entry is DNS (UDP/53). DNS is popular because it is a foundational service on which many other services, such as HTTP and SMTP, depend. The few entries in the user port range belong to popular but non-standard protocols. TCP 5358 is used by Web Services for Devices (WSD), a plug-and-play protocol [40]. TCP 7547 is used by the Customer-Premises Equipment WAN (CPE WAN) management protocol, a remote device management protocol [9]. TCP 2323 is a popular alternative to Telnet's usual port, TCP 23, particularly on IoT devices. The popularity of these non-standard protocols is likely caused by malware scanning for vulnerable machines to exploit².
4.1.2 Periodic Traffic
Our goal is to understand the heartbeat ecosystem as a whole. To this end, we examined all heartbeats originating from or directed to our network. As with the aggregate traffic, we performed a statistical analysis of the heartbeats at the protocol and port level. These results are summarized in Figure 4.2.
Figure 4.2: Summary of protocols and ports for heartbeat traffic. Panels: (a) Protocols, (b) Ports.
² https://securityintelligence.com/mirai-evolving-new-attack-reveals-use-of-port-7547/
The protocol breakdown for heartbeat traffic differs significantly from that of the aggregate traffic. UDP is a prominent protocol for heartbeats, comprising 49% of all heartbeats observed, compared with 23% of aggregate connections. TCP accounts for 45% of the heartbeat traffic, compared with 73% of aggregate connections. ICMP makes up only 6% of the heartbeat traffic, but this is still a larger share than the 4% it holds in the aggregate traffic.
We investigated why the heartbeat traffic differed so greatly from the aggregate traffic. Our investigation considered a wide range of transport-level metadata, including senders, receivers, ports, transport protocol, bytes sent, bytes received, periods, and packet histories. We found that grouping heartbeats by responding port, protocol, and period was an effective way to identify similar heartbeats. We also noted that some hosts sent or received far more heartbeats than the average host. These anomalous hosts generally belonged to specific services, such as Akamai’s CDN, and to P2P applications, including P2P botnets. Both kinds of service produced much of the heartbeat traffic in the user and dynamic port ranges and primarily used UDP, although Akamai also used TCP and ICMP. Their presence skews the protocol distribution.
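The grouping and the search for unusually active hosts can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this thesis: the Heartbeat record, the one-second period tolerance, and the "factor times the mean" threshold are all hypothetical choices.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Heartbeat(NamedTuple):
    """Hypothetical per-heartbeat record; field names are illustrative."""
    sender: str
    receiver: str
    protocol: str     # "TCP", "UDP", or "ICMP"
    resp_port: int    # responding port
    period_s: float   # estimated period in seconds


def group_heartbeats(heartbeats, period_tolerance_s=1.0):
    """Group heartbeats by (protocol, responding port, bucketed period)."""
    groups = defaultdict(list)
    for hb in heartbeats:
        # Snap the period to the nearest multiple of the tolerance so that
        # near-identical periods land in the same group.
        bucket = round(hb.period_s / period_tolerance_s) * period_tolerance_s
        groups[(hb.protocol, hb.resp_port, bucket)].append(hb)
    return dict(groups)


def heavy_hosts(heartbeats, factor=10.0):
    """Return hosts whose heartbeat count exceeds factor times the mean count."""
    counts = Counter()
    for hb in heartbeats:
        counts[hb.sender] += 1
        counts[hb.receiver] += 1
    if not counts:
        return {}
    mean = sum(counts.values()) / len(counts)
    return {host: n for host, n in counts.items() if n > factor * mean}
```

Bucketing the period is one simple way to treat slightly different period estimates as the same heartbeat; a clustering method could be substituted where periods drift.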
The protocol/port combinations of heartbeats also differ substantially from the aggregate traffic. Unlike the aggregate TCP traffic, TCP heartbeats were almost evenly divided between system ports (22% of all heartbeats) and user ports (18%). There is also increased activity in the dynamic port range, though it only reaches 5%. As in the aggregate traffic, TCP/80 and TCP/443 are the most popular well-known protocol/port combinations, representing 95% of all heartbeats in the system port range. Ports in the user port range vary widely, though some, such as TCP/5223 (used by Apple’s Push Notification Service), show significant concentrations of heartbeats.
UDP heartbeats are concentrated in the user port range rather than the system port range: user ports account for 42% of all heartbeats, system ports for only 3%, and dynamic ports for 4%. This differs greatly from the aggregate traffic and suggests that relatively few heartbeats are related to well-known services. UDP heartbeats are widely distributed across user and dynamic ports, but on system ports they are heavily concentrated on ports 53 (DNS), 137 (NetBIOS), and 443 (HTTPS over UDP).
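As a consistency check, the per-range shares reported in this section can be tallied by protocol and by port range. The sketch below uses only the percentages stated in the text; the variable and function names are illustrative.

```python
# Shares of all heartbeats (percent), as reported in this section.
HEARTBEAT_SHARES = {
    ("TCP", "system"): 22,
    ("TCP", "user"): 18,
    ("TCP", "dynamic"): 5,
    ("UDP", "system"): 3,
    ("UDP", "user"): 42,
    ("UDP", "dynamic"): 4,
}
ICMP_SHARE = 6  # ICMP has no ports, so it is tracked separately


def totals_by(index: int) -> dict:
    """Sum the shares by protocol (index 0) or by port range (index 1)."""
    totals = {}
    for key, share in HEARTBEAT_SHARES.items():
        totals[key[index]] = totals.get(key[index], 0) + share
    return totals


by_protocol = totals_by(0)  # {"TCP": 45, "UDP": 49}
by_range = totals_by(1)     # {"system": 25, "user": 60, "dynamic": 9}
```

The protocol totals match the reported 45% TCP and 49% UDP, and together with the 6% ICMP they sum to 100%. By the same figures, user ports carry 60% of all heartbeats.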
Table 4.2: Top 10 ports and protocols observed in the periodic traffic.
Number of Heartbeats  Transport Protocol  Port  Service
37,887                TCP                 443   HTTPS
Inspection of the ICMP traffic indicates that it is more prominent and varied in the heartbeat traffic than in the aggregate traffic. Among the ICMP heartbeats, ‘Network Unreachable’ and ‘Host Unreachable’ messages are the most prominent, followed, to a lesser but still noticeable extent, by ‘Port Unreachable’ messages. The volume and variety of the ICMP heartbeats suggest the presence of periodic scanning.
The most popular protocol/port combinations also differ greatly from those in the aggregate traffic. Table 4.2 shows the most popular protocol/port combinations for our periodic traffic. Instead of being dominated by well-known protocols in the system port range, the majority fall into the user and dynamic port ranges. These combinations belong to user applications such as the Sality botnet (UDP/TCP 8888) or PPStream (UDP 17788). HTTP and HTTPS nevertheless remain highly popular in the periodic traffic. This indicates that many heartbeat-generating services use a Web-based architecture. Developers also often use HTTP/S for non-Web traffic because, unlike most other ports, ports 80 and 443 are often not blocked by default [12]. In addition, we see that ICMP 0 (Network Unreachable) and ICMP 1 (Host Unreachable) are prevalent periodic communications.
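The ICMP labels used in this section can be read as codes of the ICMP Destination Unreachable message (type 3). The sketch below assumes that the number reported alongside ICMP is the message code, which is an interpretation of the labels rather than something stated in the text; the names are illustrative.

```python
DEST_UNREACHABLE_TYPE = 3  # ICMP Destination Unreachable (RFC 792)

# Codes of the Destination Unreachable message relevant to this section.
DEST_UNREACHABLE_CODES = {
    0: "Network Unreachable",
    1: "Host Unreachable",
    2: "Protocol Unreachable",
    3: "Port Unreachable",
}


def icmp_label(code: int, icmp_type: int = DEST_UNREACHABLE_TYPE) -> str:
    """Return a readable label for a Destination Unreachable code."""
    if icmp_type != DEST_UNREACHABLE_TYPE:
        return f"ICMP type {icmp_type}, code {code}"
    name = DEST_UNREACHABLE_CODES.get(code)
    return name if name is not None else f"Destination Unreachable, code {code}"
```

Under this reading, icmp_label(0) and icmp_label(1) correspond to the two prevalent ICMP heartbeats noted above.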